Abstract: Generative text-to-image models are advancing at an unprecedented pace, continuously shifting the perceptual quality ceiling and rendering previously collected labels unreliable for newer generations. To address this, we present ELIQ, a Label-free Framework for Quality Assessment of Evolving AI-generated Images. ELIQ focuses on two aspects: visual quality and prompt-image alignment. It automatically constructs positive and aspect-specific negative pairs that cover both conventional distortions and distortion modes specific to AI-generated content (AIGC), which enables transferable supervision without human annotations. Building on these pairs, ELIQ adapts a pre-trained multimodal model into a quality-aware critic via instruction tuning and predicts two-dimensional quality using lightweight gated fusion and a Quality Query Transformer. Experiments across multiple benchmarks demonstrate that ELIQ consistently outperforms existing label-free methods, generalizes from AIGC to user-generated content (UGC) scenarios without modification, and paves the way for scalable, label-free quality assessment under continuously evolving generative models. The code will be released upon publication.
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM) |
| Cite as: | arXiv:2602.03558 [cs.CV] (or arXiv:2602.03558v2 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2602.03558 (arXiv-issued DOI via DataCite) |
Submission history
From: Xinyue Li
[v1] Tue, 3 Feb 2026 14:04:51 UTC (2,204 KB)
[v2] Wed, 29 Apr 2026 03:49:13 UTC (2,204 KB)
