TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA


Abstract: Large Language Models (LLMs) are widely applied in real-world scenarios, yet fine-tuning them incurs significant computational and storage costs. Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA mitigate these costs; however, the adapted parameters depend on the base model and cannot be transferred across different backbones. One way to address this issue is knowledge distillation, but its effectiveness inherently depends on the training data. Recent work such as TransLoRA avoids this dependence by generating synthetic data; nevertheless, it adds complexity because it requires training an additional discriminator model. In this paper, we propose TiTok, a new framework that enables effective LoRA Transplantation through Token-level knowledge transfer. Specifically, TiTok captures task-relevant information through a token-wise contrastive excess between a source model with and without LoRA. This excess highlights informative tokens and enables selective filtering of synthetic data, all without additional models or overhead. Through experiments on three benchmarks across multiple transfer settings, we demonstrate that TiTok is consistently effective, achieving average performance gains of 4 to 10% over baselines.
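
The token-wise contrastive excess described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this sketch assumes the excess of a token is the difference between its log-probability under the source model with LoRA and without LoRA, and that filtering keeps a fixed top fraction of tokens. The function names, the `keep_ratio` parameter, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def contrastive_excess(
    logp_with_lora: Sequence[float],
    logp_without_lora: Sequence[float],
) -> List[float]:
    """Per-token excess, ASSUMED here to be a log-probability difference.

    Each input holds one log-probability per target token, scored by the
    same source backbone with and without the LoRA adapter attached.
    """
    if len(logp_with_lora) != len(logp_without_lora):
        raise ValueError("both sequences must score the same tokens")
    return [a - b for a, b in zip(logp_with_lora, logp_without_lora)]


def top_fraction_mask(excess: Sequence[float], keep_ratio: float = 0.5) -> List[bool]:
    """Mark the tokens with the largest excess (hypothetical selection rule)."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, math.ceil(keep_ratio * len(excess)))
    # Rank token positions from highest to lowest excess.
    ranked = sorted(range(len(excess)), key=lambda i: excess[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [i in keep for i in range(len(excess))]


# Toy log-probabilities for a four-token sequence (made-up values).
logp_lora = [-0.2, -1.0, -0.1, -3.0]
logp_base = [-0.3, -2.5, -0.1, -3.2]

excess = contrastive_excess(logp_lora, logp_base)
mask = top_fraction_mask(excess, keep_ratio=0.5)
```

Under these assumptions, tokens whose likelihood rises most when the adapter is attached are treated as the task-relevant ones, and the resulting mask could be used to restrict a training loss on the target model to those positions. No extra model is needed beyond the two forward passes of the source backbone.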

Comments:	ICLR 2026
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2510.04682 [cs.CL]
	(or arXiv:2510.04682v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2510.04682 arXiv-issued DOI via DataCite

Submission history

From: ChanJoo Jung
[v1] Mon, 6 Oct 2025 10:47:22 UTC (678 KB)
[v2] Sat, 28 Feb 2026 14:47:43 UTC (750 KB)
[v3] Thu, 14 May 2026 06:31:22 UTC (737 KB)