LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing

Ruijie (Ray) Zhang, Ziyue (Alvin) Liu, Zhengyang Wang, Zheng Zhang

Advances in Neural Information Processing Systems 38 (NeurIPS 2025) Main Conference Track

Training foundation models such as ViTs and LLMs incurs tremendous computational cost. Low-rank matrix or tensor factorization offers a parameter-efficient alternative, but often degrades performance due to the restricted parameter space. In this work, we introduce ${\textbf{Latent Crossing (LaX)}}$ -- a simple yet effective plug-and-play module that enhances the capacity of low-rank models by enabling information flow across low-rank subspaces. We extensively validate the benefits of LaX on pre-training tasks with ViT-Base/Large and LLaMA-like models ranging from 60M to 1B parameters. LaX boosts low-rank model performance to match or exceed the full-rank baselines while using 2-3$\times$ fewer parameters. When equipped with low-rank adapters (i.e., LoRA) for fine-tuning LLaMA-7B/13B, LaX consistently improves performance on arithmetic and commonsense reasoning tasks with negligible cost.
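To make the setting concrete, below is a minimal sketch of the low-rank factorized linear layer that the abstract contrasts with full-rank training; it is not the LaX module itself, whose crossing mechanism is not described here. All names (`LowRankLinear`), the choice of rank, and the dimensions are illustrative assumptions, written in PyTorch.

```python
# Illustrative sketch only: a rank-r factorized linear layer, NOT the authors' LaX module.
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Replaces a d_out x d_in weight W with a product B @ A of rank r << min(d_in, d_out).

    Parameter count drops from d_in * d_out to r * (d_in + d_out), which is the
    parameter-efficient (but capacity-restricted) setting the abstract refers to.
    """
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        # A projects the input into a rank-r latent subspace; B maps it back out.
        self.A = nn.Parameter(torch.randn(rank, d_in) / d_in ** 0.5)
        self.B = nn.Parameter(torch.randn(d_out, rank) / rank ** 0.5)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., d_in) -> latent (..., rank) -> output (..., d_out)
        return x @ self.A.T @ self.B.T + self.bias

if __name__ == "__main__":
    full_params = 768 * 768              # full-rank weight of a 768-dim projection (e.g., ViT-Base width)
    low_params = 64 * (768 + 768)        # rank-64 factorization of the same weight
    print(f"full-rank: {full_params}, low-rank: {low_params} (~{full_params / low_params:.1f}x fewer)")
    layer = LowRankLinear(768, 768, rank=64)
    y = layer(torch.randn(2, 16, 768))
    print(y.shape)                        # torch.Size([2, 16, 768])
```

In this illustrative setting, LaX would act as a plug-and-play addition that lets information flow across such rank-r latent subspaces; the exact crossing operation is specified in the paper, not in this sketch.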