Abstract: Modern retrieval-augmented generation (RAG) systems convert sensitive content into high-dimensional embeddings and store them in vector databases that treat the resulting numerical artifacts as opaque. Major vector-store products do not provide native controls for embedding integrity, ingestion-time distributional anomaly detection, or cryptographic provenance attestation. We show this opens a class of steganographic exfiltration attacks: an attacker with write access to the ingestion pipeline can hide payload data inside embeddings using simple post-embedding perturbations (noise injection, rotation, scaling, offset, fragmentation, and combinations thereof) while preserving the surface-level retrieval behavior the RAG system exposes to legitimate users.
We evaluate these techniques across a synthetic-PII corpus on text-embedding-3-large, four locally hosted open embedding models, a cross-corpus replication on BEIR NFCorpus and a Quora subset (over 26,000 chunks combined), seven vector-store configurations, an adaptive-attacker variant of the detector evaluation, and a paraphrased-query retrieval benchmark. Distribution-shifting perturbations are often caught by simple anomaly detectors; small-angle orthogonal rotation defeats distribution-based detection across every (model, corpus) pair tested. A disjoint-Givens rotation encoder gives a closed-form per-vector capacity ceiling of floor(d/2) * b bits, but real embedding manifolds impose a capacity-detectability trade-off, and the retrieval-preserving operating point sits well below it.
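The capacity ceiling floor(d/2) * b can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the paper's encoder: the function names, the uniform angle quantization, the `max_angle` parameter, and a decoder that has access to the unmodified vector are all hypothetical choices made for this example. Each disjoint coordinate pair (0,1), (2,3), ... is rotated by a small angle that encodes one b-bit symbol, so a d-dimensional vector has floor(d/2) slots of b bits each.

```python
import math

def capacity_bits(d: int, b: int) -> int:
    """Ceiling stated in the abstract: floor(d/2) * b bits per vector."""
    return (d // 2) * b

def encode(vec, symbols, b, max_angle):
    """Rotate disjoint pairs (0,1), (2,3), ... by an angle encoding one b-bit symbol each.

    Assumption for this sketch: symbol s maps to the angle s * max_angle / 2**b.
    """
    levels = 1 << b
    if len(symbols) > len(vec) // 2:
        raise ValueError("more symbols than disjoint coordinate pairs")
    step = max_angle / levels
    out = list(vec)
    for k, s in enumerate(symbols):
        if not 0 <= s < levels:
            raise ValueError("symbol does not fit in b bits")
        i, j = 2 * k, 2 * k + 1
        c, sn = math.cos(s * step), math.sin(s * step)
        x, y = vec[i], vec[j]
        out[i] = c * x - sn * y  # 2x2 Givens rotation on the (i, j) plane
        out[j] = sn * x + c * y
    return out

def decode(original, stego, n_symbols, b, max_angle):
    """Recover symbols from per-pair angle differences.

    Assumes the decoder knows the original vector and that no pair is (0, 0).
    """
    levels = 1 << b
    step = max_angle / levels
    symbols = []
    for k in range(n_symbols):
        i, j = 2 * k, 2 * k + 1
        delta = math.atan2(stego[j], stego[i]) - math.atan2(original[j], original[i])
        delta = (delta + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
        symbols.append(round(delta / step) % levels)
    return symbols

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Toy 7-dimensional vector: 3 disjoint pairs, last coordinate left untouched.
original = [0.6, 0.8, -0.3, 0.4, 0.1, -0.2, 0.5]
payload = [3, 0, 2]  # three 2-bit symbols
stego = encode(original, payload, b=2, max_angle=0.05)
recovered = decode(original, stego, n_symbols=3, b=2, max_angle=0.05)
```

Because each pair is rotated rather than scaled or shifted, the vector norm is unchanged, which is consistent with the abstract's observation that small-angle rotation is hard to catch with distribution-based detectors. For scale, assuming d = 3072 and b = 1, the ceiling would be 1536 bits per vector; the abstract notes that the retrieval-preserving operating point sits well below the ceiling.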
We propose VectorPin, a cryptographic provenance protocol that pins each embedding to its source content and producing model via an Ed25519 signature over a canonical byte representation. Any post-embedding modification breaks signature verification. Embedding-level integrity is a deployable, standardizable control that closes this attack class.
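The pinning idea can be sketched as follows. This is a minimal illustration assuming a hypothetical field layout (length-prefixed model identifier, SHA-256 digest of the source text, dimension, little-endian float32 components); it is not the VectorPin wire format, and the function name `canonical_bytes` is invented for this example. The Ed25519 signing step itself is omitted because it requires a third-party library; the sketch only shows the message that would be signed and why a post-embedding change invalidates a signature over it.

```python
import hashlib
import struct

def canonical_bytes(model_id: str, source_text: str, vector) -> bytes:
    """Build a deterministic message binding a vector to its source and model.

    Hypothetical layout for this sketch:
      uint32 len(model_id) | model_id (UTF-8) | SHA-256(source_text) |
      uint32 dimension     | dimension x float32 (little-endian)
    """
    model = model_id.encode("utf-8")
    source_digest = hashlib.sha256(source_text.encode("utf-8")).digest()
    header = struct.pack("<I", len(model)) + model + source_digest
    body = struct.pack("<I", len(vector)) + struct.pack(f"<{len(vector)}f", *vector)
    return header + body

# Toy example: the same inputs always give the same message to sign.
vec = [0.25, -0.5, 0.125]
msg = canonical_bytes("example-model", "patient record 17", vec)

# A post-embedding perturbation of one component changes the message,
# so a signature computed over `msg` would not verify against `tampered_msg`.
tampered = [0.25 + 1e-3, -0.5, 0.125]
tampered_msg = canonical_bytes("example-model", "patient record 17", tampered)
```

In a deployment along the lines the abstract describes, the ingestion service would sign `msg` with an Ed25519 private key and store the signature next to the vector; a verifier would rebuild the message from the stored vector and check the signature. One design caveat of this particular layout: serializing as float32 means changes below float32 resolution are not represented, so the stored vector should be held at the same precision as the signed one.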
| Comments: | 47 pages, 3 figures. Reference implementations: this https URL and this https URL |
| Subjects: | Cryptography and Security (cs.CR); Information Retrieval (cs.IR); Machine Learning (cs.LG) |
| ACM classes: | K.6.5; I.2.7; H.3.3 |
| Cite as: | arXiv:2605.13764 [cs.CR] |
| | (or arXiv:2605.13764v1 [cs.CR] for this version) |
| | https://doi.org/10.48550/arXiv.2605.13764 (arXiv-issued DOI via DataCite, pending registration) |
| Related DOI: | https://doi.org/10.5281/zenodo.20076420 |
Submission history
From: Jascha Wanger
[v1] Wed, 13 May 2026 16:44:20 UTC (223 KB)
