LINE: LLM-based Iterative Neuron Explanations for Vision Models

Abstract: Interpreting individual neurons in deep neural networks is a crucial step towards understanding their complex decision-making processes and ensuring AI safety. Despite recent progress in neuron labeling, existing methods often limit the search space to predefined concept vocabularies or produce overly specific descriptions that fail to capture higher-order, global concepts. We introduce LINE, a novel, training-free iterative approach tailored for open-vocabulary concept labeling in vision models. Operating in a strictly black-box setting, LINE leverages a large language model and a text-to-image generator to iteratively propose and refine concepts in a closed loop, guided by activation history. LINE achieves state-of-the-art performance across multiple model architectures, yielding AUC improvements of up to 0.11 on ImageNet and 0.05 on Places365, while discovering, on average, 27% of new concepts missed by predefined vocabularies. Beyond identifying the top concept, LINE provides a complete generation history, enabling polysemanticity evaluation and producing visual explanations that rival gradient-dependent activation maximization methods. The source code will be made available soon.
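The closed loop described in the abstract (propose a concept, generate images for it, read the neuron's activation, feed the history back to the proposer) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which has not been released: the function `iterative_concept_search`, its parameters, the use of mean activation as the score, and the toy stand-ins for the language model, the text-to-image generator, and the neuron are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# (concept, mean activation) pairs accumulated over iterations.
History = List[Tuple[str, float]]


def iterative_concept_search(
    propose: Callable[[History], str],           # hypothetical LLM wrapper
    generate_images: Callable[[str], list],      # hypothetical text-to-image wrapper
    neuron_activation: Callable[[object], float],  # black-box neuron readout
    n_iters: int = 5,
) -> History:
    """Propose, score, and record concepts; return history sorted by score."""
    history: History = []
    for _ in range(n_iters):
        # The proposer sees every earlier concept and its score.
        concept = propose(history)
        images = generate_images(concept)
        # Assumption: a concept is scored by the mean activation over its images.
        score = sum(neuron_activation(img) for img in images) / len(images)
        history.append((concept, score))
    return sorted(history, key=lambda item: item[1], reverse=True)


# Toy stand-ins so the sketch runs without any model (all values invented).
CANDIDATES = ["dog", "striped fur", "tiger", "orange cat"]
TOY_SCORES = {"dog": 0.1, "striped fur": 0.6, "tiger": 0.9, "orange cat": 0.4}


def toy_propose(history: History) -> str:
    # A real proposer would prompt an LLM with the history; this one just
    # returns the first candidate that has not been tried yet.
    tried = {concept for concept, _ in history}
    for candidate in CANDIDATES:
        if candidate not in tried:
            return candidate
    return CANDIDATES[-1]


def toy_generate_images(concept: str) -> list:
    # Placeholder "images": three copies of the concept string.
    return [concept] * 3


def toy_neuron_activation(image: object) -> float:
    # Placeholder readout: look up an invented activation per concept.
    return TOY_SCORES[str(image)]


ranked = iterative_concept_search(
    toy_propose, toy_generate_images, toy_neuron_activation, n_iters=4
)
top_concept = ranked[0][0]
```

Because only `neuron_activation` touches the vision model, the loop needs no gradients or weights, which matches the black-box setting the abstract describes. The full `ranked` list, rather than `top_concept` alone, corresponds to the generation history that the abstract says supports polysemanticity evaluation.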

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2604.08039 [cs.CV]
	(or arXiv:2604.08039v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2604.08039 arXiv-issued DOI via DataCite

Submission history

From: Vladimir Zaigrajew
[v1] Thu, 9 Apr 2026 09:43:26 UTC (26,931 KB)
[v2] Tue, 12 May 2026 21:50:33 UTC (35,370 KB)