Abstract:Preserving model fidelity is essential for stealthy text-to-image (T2I) backdoor attacks. Existing methods such as Learning without Forgetting (LwF) rely on output-based distillation, which provides limited regularization. We introduce Elastic Weight Consolidation (EWC) as a parameter-based alternative for preserving fidelity in backdoor learning. While stronger in principle, we show that standard static EWC with a fixed regularization weight lambda and mean-squared utility loss creates an artificial trade-off between attack success rate (ASR) and fidelity, particularly degrading performance on weak triggers. To address this, we propose Cosine-Aware Adaptive EWC, which dynamically adjusts EWC regularization using a cosine-based semantic utility and adaptive scheduling. This approach transforms EWC from a fixed penalty into a context-sensitive constraint, maintaining high ASR while preserving model fidelity. Experiments demonstrate improved ASR-fidelity balance and enhanced robustness on out-of-domain (OOD) datasets compared to existing baselines.
| Subjects: | Machine Learning (cs.LG); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.08280 [cs.LG] |
| (or arXiv:2605.08280v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2605.08280 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Bowen Lu [view email]
[v1]
Fri, 8 May 2026 07:09:59 UTC (6,040 KB)
