Abstract: As large language models (LLMs) continue to scale in size, computational overhead has become a major bottleneck for task-specific fine-tuning. While low-rank adaptation (LoRA) effectively curtails this cost by confining the weight updates to a low-dimensional subspace, this restriction can hinder effectiveness and slow convergence. This contribution addresses these limitations by progressively accumulating a high-rank weight update from consecutive low-rank increments. Specifically, the per-update optimal low-rank matrix is identified to minimize the loss function and closely approximate full fine-tuning. To enable efficient and seamless optimization without restarting, this optimal choice is formed by appropriately scaling the columns of the original low-rank matrix. Rigorous performance guarantees reveal that the optimal scaling can be found analytically. Extensive numerical tests with popular LLMs scaling up to 12 billion parameters demonstrate consistent performance gains and faster convergence than state-of-the-art LoRA variants on diverse tasks including natural language understanding, commonsense reasoning, and mathematical problem solving.
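The accumulation idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' algorithm: the matrices, the scale values, and the helper names (`matmul`, `scale_columns`, `add`, `rank`) are hypothetical and chosen by hand, whereas the abstract states that the optimal column scaling is derived analytically to minimize the loss. The sketch only shows the underlying linear-algebra fact that a sum of low-rank increments, each formed as a column-scaled factor times a second factor, can reach a higher rank than any single increment.

```python
from fractions import Fraction


def matmul(B, A):
    """Multiply an (m x r) matrix by an (r x n) matrix, both given as lists of rows."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]


def scale_columns(B, s):
    """Scale column k of B by s[k]; equivalent to B @ diag(s)."""
    return [[row[k] * s[k] for k in range(len(s))] for row in B]


def add(X, Y):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def rank(M):
    """Exact matrix rank via Gaussian elimination over rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, rk = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((r for r in range(rk, rows) if M[r][c] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(rk + 1, rows):
            f = M[r][c] / M[rk][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk


# Hypothetical rank-1 increments (B_t is 4x1, A_t is 1x4) with hand-picked
# column scales s_t. In the paper the scales are chosen optimally; here they
# are arbitrary numbers used only for illustration.
increments = [
    ([[1], [0], [0], [0]], [[1, 0, 0, 0]], [2]),
    ([[0], [1], [0], [0]], [[0, 1, 0, 0]], [3]),
    ([[0], [0], [1], [0]], [[0, 0, 1, 0]], [5]),
]

delta_W = [[0] * 4 for _ in range(4)]  # accumulated weight update
ranks = []                             # (rank of increment, rank of running sum)
for B, A, s in increments:
    step = matmul(scale_columns(B, s), A)  # one low-rank increment
    delta_W = add(delta_W, step)           # accumulate without restarting
    ranks.append((rank(step), rank(delta_W)))

print(ranks)
```

Each increment in this toy example has rank 1, while the rank of the accumulated update grows by one per step because the hand-picked directions are linearly independent. With overlapping directions the accumulated rank would grow more slowly.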
| Comments: | Accepted to ICML 2026 |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2510.23818 [cs.LG] (or arXiv:2510.23818v2 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2510.23818 (arXiv-issued DOI via DataCite) |
Submission history
From: Yilang Zhang
[v1] Mon, 27 Oct 2025 19:59:46 UTC (1,809 KB)
[v2] Thu, 14 May 2026 15:36:40 UTC (3,231 KB)
