TRIM: Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning

Abstract: Instruction tuning is essential for aligning large language models (LLMs) to downstream tasks and commonly relies on large, diverse corpora. However, small, high-quality subsets, known as coresets, can deliver comparable or superior results, though curating them remains challenging. Existing methods often rely on coarse, sample-level signals like gradients, an approach that is computationally expensive and overlooks fine-grained features. To address this, we introduce TRIM (Token Relevance via Interpretable Multi-layer Attention), a forward-only, token-centric framework. Instead of using gradients, TRIM operates by matching underlying representational patterns identified via attention-based "fingerprints" from a handful of target samples. Such an approach makes TRIM highly efficient and uniquely sensitive to the structural features that define a task. Coresets selected by our method consistently outperform state-of-the-art baselines by up to 9% on downstream tasks and even surpass the performance of full-data fine-tuning in some settings. By avoiding expensive backward passes, TRIM achieves this at a fraction of the computational cost. These findings establish TRIM as a scalable and efficient alternative for building high-quality instruction-tuning datasets.
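
The abstract does not spell out how the attention-based "fingerprints" are computed or matched, so the following is only an illustrative sketch of the general idea (forward-only, token-weighted selection), not the paper's algorithm. It assumes that each sample has already been run through a model once to obtain token representations `hidden` of shape (T, d) and row-stochastic attention maps `attn` of shape (heads, T, T). The saliency rule (attention received per token), the cosine-similarity score, and all function names (`token_saliency`, `fingerprint`, `select_coreset`, `random_sample`) are hypothetical choices made for this example.

```python
import numpy as np

def token_saliency(attn):
    """attn: (heads, T, T), each row sums to 1. Returns (T,) weights summing to 1.

    Assumption: a token's saliency is the total attention it receives,
    averaged over heads. This is an illustrative rule, not the paper's.
    """
    received = attn.mean(axis=0).sum(axis=0)  # column sums of the head-averaged map
    return received / received.sum()

def fingerprint(hidden, attn):
    """Saliency-weighted mean of token representations, L2-normalised."""
    w = token_saliency(attn)          # (T,)
    v = w @ hidden                    # (d,)
    return v / (np.linalg.norm(v) + 1e-12)

def select_coreset(candidates, targets, k):
    """candidates, targets: lists of (hidden, attn) pairs.

    Scores each candidate by the dot product between its fingerprint and the
    mean target fingerprint, then returns the indices of the k best candidates
    together with all scores. No gradients or backward passes are involved.
    """
    target_fp = np.mean([fingerprint(h, a) for h, a in targets], axis=0)
    scores = np.array([fingerprint(h, a) @ target_fp for h, a in candidates])
    return np.argsort(-scores)[:k], scores

def random_sample(rng, T, d, heads):
    """Synthetic stand-in for one forward pass: random states and softmax attention."""
    hidden = rng.normal(size=(T, d))
    logits = rng.normal(size=(heads, T, T))
    attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    return hidden, attn

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    targets = [random_sample(rng, 6, 8, 2) for _ in range(3)]
    pool = [random_sample(rng, 5, 8, 2) for _ in range(20)]
    top, scores = select_coreset(pool, targets, k=5)
    print("selected indices:", top)
```

In a real pipeline the synthetic `random_sample` would be replaced by hidden states and attention maps read from a single forward pass of the model being tuned. The point of the sketch is that the selection signal is computed per token and aggregated, rather than derived from per-sample gradients.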

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2510.07118 [cs.CL]
	(or arXiv:2510.07118v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2510.07118 arXiv-issued DOI via DataCite

Submission history

From: Manish Nagaraj
[v1] Wed, 8 Oct 2025 15:11:04 UTC (10,637 KB)
[v2] Wed, 28 Jan 2026 14:13:19 UTC (11,793 KB)
[v3] Thu, 14 May 2026 16:54:35 UTC (11,794 KB)