Abstract: Diffusion models depend on pseudo-random number generators (PRNGs) for latent noise sampling. We present DiffusionHijack, a supply-chain backdoor attack that hijacks the PRNG to deterministically control generated images. A malicious PRNG, injected via compromised packages, forces pixel-perfect reproduction of attacker-chosen content (SSIM = 1.00, N = 100 trials) on Stable Diffusion v1.4, v1.5, and SDXL without modifying model weights. Because it operates entirely outside the neural network computation graph, the attack is inherently undetectable by existing model auditing and content moderation mechanisms. It remains effective under stochastic sampling (eta > 0), bypasses CLIP-based safety checkers (98-100% success), and operates independently of the user's prompt. As a countermeasure, we replace the PRNG with a quantum random number generator (QRNG), which provides information-theoretic unpredictability. Across N = 100 prompt-model combinations, the QRNG defense completely neutralizes the attack, reducing output similarity to random-baseline levels (SSIM < 0.20 for SD 1.x models, SSIM < 0.45 for SDXL). This work exposes a previously overlooked supply-chain vulnerability and offers a fundamental, hardware-level mitigation for generative AI systems.
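The principle behind the abstract can be illustrated with a toy model: if the sampling loop is a deterministic function of its initial noise, whoever controls the noise source controls the output. The sketch below is not the paper's implementation. The names `toy_sampler`, `honest_noise`, `hijacked_noise`, and `entropy_noise` are hypothetical, the "sampler" is a trivial arithmetic loop rather than a diffusion model, and OS entropy (`secrets.SystemRandom`) is used only as a stand-in for a QRNG. It also does not model the prompt or the safety checker.

```python
import random
import secrets


def toy_sampler(noise):
    # Hypothetical stand-in for a deterministic denoising loop:
    # the result depends only on the initial noise vector.
    x = list(noise)
    for _ in range(10):
        x = [0.9 * v for v in x]
    return x


def honest_noise(seed, n=8):
    # Honest PRNG: the caller's seed determines the noise.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def hijacked_noise(seed, n=8):
    # Toy "hijacked" PRNG: silently ignores the caller's seed and
    # replays a fixed internal state (1234 is an arbitrary constant).
    rng = random.Random(1234)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def entropy_noise(seed, n=8):
    # Stand-in for a QRNG (assumption): draws from OS entropy, so the
    # seed argument has no effect and the noise cannot be replayed.
    rng = secrets.SystemRandom()
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    same_honest = toy_sampler(honest_noise(1)) == toy_sampler(honest_noise(2))
    same_hijack = toy_sampler(hijacked_noise(1)) == toy_sampler(hijacked_noise(2))
    same_entropy = toy_sampler(entropy_noise(1)) == toy_sampler(entropy_noise(1))
    print("honest, different seeds, identical output:", same_honest)
    print("hijacked, different seeds, identical output:", same_hijack)
    print("entropy source, repeated call, identical output:", same_entropy)
```

In this toy setting the hijacked source yields the same output for every user seed, while the entropy-backed source yields a fresh output on each call. That mirrors, in miniature, the contrast the abstract draws between the attacked pipeline and the QRNG defense.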
| Comments: | This work has been submitted to the IEEE for possible publication |
| Subjects: | Cryptography and Security (cs.CR); Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.13115 [cs.CR] (or arXiv:2605.13115v1 [cs.CR] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.13115 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Ziyang You
[v1] Wed, 13 May 2026 07:34:04 UTC (7,026 KB)
