Abstract: Generating adversarial examples at scale is a core primitive for robustness evaluation, adversarial training, and red-teaming, yet even "fast" attacks such as FGSM remain throughput-limited by the cost of a backward pass. We introduce a family of attacks that eliminates the backward pass by predicting the input gradient from forward-pass hidden states via a lightweight linear regression. The approach is motivated by a kernel view of neural networks and is exact in the Neural Tangent Kernel regime, while remaining effective for practical finite-width models. Empirically, our methods recover much of FGSM's attack performance while using only a small fraction of the time, corresponding to a $532\%$ increase in throughput. These results suggest gradient prediction as a simple and general route to significantly faster adversarial generation under realistic wall-clock constraints.
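The idea in the abstract, replacing the backward pass with a linear map from hidden states to input gradients, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy one-hidden-layer ReLU network, its scalar output standing in for the loss, the ridge penalty, and the helper names `fit_gradient_predictor` and `predicted_sign_attack`. The analytic gradient is used only to build regression targets; at attack time only a forward pass and one matrix product are needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model (not the paper's): one hidden ReLU layer, scalar output.
d, h = 8, 64
W1 = rng.normal(size=(h, d)) / np.sqrt(d)
w2 = rng.normal(size=h) / np.sqrt(h)

def hidden(X):
    """Forward pass up to the hidden layer for a batch X of shape (n, d)."""
    return np.maximum(X @ W1.T, 0.0)

def features(X):
    """Hidden states with a constant column appended for the regression bias."""
    H = hidden(X)
    return np.hstack([H, np.ones((len(X), 1))])

def true_input_grad(X):
    """Analytic gradient of the scalar output w.r.t. the input.
    Stands in for the backward pass; used only to create regression targets."""
    mask = (hidden(X) > 0).astype(float)
    return (mask * w2) @ W1

def fit_gradient_predictor(X, ridge=1e-3):
    """Ridge regression from hidden-state features to input gradients."""
    Phi = features(X)
    G = true_input_grad(X)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ G)

def predict_grad(X, B):
    """Predicted input gradient: forward pass plus one matrix product."""
    return features(X) @ B

def predicted_sign_attack(X, B, eps):
    """FGSM-style step that uses the predicted gradient instead of the true one."""
    return X + eps * np.sign(predict_grad(X, B))

# Fit on one batch, then attack held-out inputs without any backward pass.
X_train = rng.normal(size=(512, d))
X_test = rng.normal(size=(128, d))
eps = 0.1

B = fit_gradient_predictor(X_train)
X_adv = predicted_sign_attack(X_test, B, eps)

# Diagnostic: how often the predicted sign matches the true gradient sign.
agreement = np.mean(np.sign(predict_grad(X_test, B)) == np.sign(true_input_grad(X_test)))
print(f"held-out sign agreement: {agreement:.3f}")
```

In this sketch the cost of fitting is paid once, after which each attack needs only the forward activations. How closely such a predictor tracks the true gradient on a real model is an empirical question that this toy example does not answer.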
| Comments: | 17 pages |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.14868 [cs.LG] (or arXiv:2605.14868v1 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.14868 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Nicolò Felicioni
[v1] Thu, 14 May 2026 14:16:51 UTC (191 KB)
