Abstract: Training data attribution (TDA) identifies which training examples most influenced a model's prediction. Influence function methods are a theoretically grounded family of TDA methods that rely on per-example gradients. To overcome the scalability challenge of gradient computation, the most popular strategy is random projection (e.g., TRAK, LoGRA). However, random projection still faces two bottlenecks when scaling to large training sets and high-quality attribution: \emph{(i)} storing and loading projected per-example gradients for all $N$ training examples, where query latency is dominated by I/O; and \emph{(ii)} forming the $D \times D$ inverse Hessian approximation, which costs $O(D^2)$ memory. Both bottlenecks scale with the projection dimension $D$, yet increasing $D$ is necessary for attribution quality, creating a quality--scalability tradeoff. We introduce \textbf{LoRIF} (\textbf{Lo}w-\textbf{R}ank \textbf{I}nfluence \textbf{F}unctions), which exploits the low-rank structure of gradients to address both bottlenecks. First, we store rank-$c$ factors of projected per-example gradients rather than full matrices, reducing storage and query-time I/O from $O(D)$ to $O(c\sqrt{D})$ per layer per sample. Second, we use truncated SVD with the Woodbury identity to approximate the inverse Hessian term in an $r$-dimensional subspace, reducing memory from $O(D^2)$ to $O(Dr)$. On models from 0.1B to 70B parameters trained on datasets with millions of examples, LoRIF achieves up to 20$\times$ storage reduction and query-time speedup compared to LoGRA, while matching or exceeding its attribution quality. LoRIF makes gradient-based TDA practical at frontier scale.
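The two ideas in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes each projected per-example gradient for a layer is a $d_1 \times d_2$ matrix with $D = d_1 d_2$, that the Hessian is approximated by a damped Gauss-Newton-style term $G^\top G / N + \lambda I$, and that the full gradient matrix fits in memory for the SVD. The helper names `low_rank_factors` and `woodbury_inverse_apply`, the damping value, and the synthetic data are all hypothetical, and sign and scaling conventions of the influence score are omitted.

```python
import numpy as np

def low_rank_factors(grad_matrix, c):
    """Return rank-c factors (A, B) such that grad_matrix is approximately A @ B."""
    U, S, Vt = np.linalg.svd(grad_matrix, full_matrices=False)
    return U[:, :c] * S[:c], Vt[:c, :]  # shapes (d1, c) and (c, d2)

def woodbury_inverse_apply(V, eigvals, damping, x):
    """Apply (V diag(eigvals) V^T + damping * I)^{-1} to x.

    V has orthonormal columns (D x r), so the Woodbury identity reduces to
    a diagonal correction and no D x D matrix is ever formed.
    """
    coeff = eigvals / (eigvals + damping)
    return (x - V @ (coeff * (V.T @ x))) / damping

# Toy sizes (assumptions for illustration only).
rng = np.random.default_rng(0)
N, d1, d2, c, r, damping = 200, 16, 16, 2, 8, 0.1
D = d1 * d2

# Synthetic per-example gradients that are exactly rank c.
left = rng.standard_normal((N, d1, c))
right = rng.standard_normal((N, c, d2))
grads = np.einsum("nic,ncj->nij", left, right)

# (i) Store rank-c factors instead of full d1 x d2 matrices.
factors = [low_rank_factors(g, c) for g in grads]
A = np.stack([f[0] for f in factors])  # (N, d1, c)
B = np.stack([f[1] for f in factors])  # (N, c, d2)
floats_full = D                        # per example, full matrix
floats_factored = c * (d1 + d2)        # per example, factored

# (ii) Rank-r subspace of the Hessian approximation via truncated SVD.
G = grads.reshape(N, D)
_, S, Vt = np.linalg.svd(G, full_matrices=False)
V = Vt[:r].T                 # (D, r): O(D r) memory instead of O(D^2)
eigvals = S[:r] ** 2 / N     # top-r eigenvalues of G^T G / N

# Score every training example against one query gradient.
query = rng.standard_normal(D)
hinv_q = woodbury_inverse_apply(V, eigvals, damping, query)
Q = hinv_q.reshape(d1, d2)
# <Q, A_n @ B_n> for each n, without rebuilding the full gradients.
scores = np.einsum("ij,nic,ncj->n", Q, A, B)
```

With $d_1 = d_2 = \sqrt{D}$, the factored form stores $2c\sqrt{D}$ numbers per example (64 here) instead of $D$ (256 here), which is where the $O(c\sqrt{D})$ figure comes from under this assumption. Because the toy gradients are exactly rank $c$, the factored scores coincide with the dense ones; on real gradients the factors would only approximate them.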
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2601.21929 [cs.LG] (or arXiv:2601.21929v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2601.21929 (arXiv-issued DOI via DataCite)
Submission history
From: Shuangqi Li
[v1] Thu, 29 Jan 2026 16:18:34 UTC (264 KB)
[v2] Wed, 13 May 2026 19:57:11 UTC (627 KB)
