Abstract: Differentially private optimization suffers from a fundamental geometric mismatch: deep networks have highly anisotropic loss landscapes, yet DP-SGD injects isotropic noise. Second-order preconditioning can resolve this, but estimating curvature typically requires private data (consuming privacy budget) or public data (introducing distribution shift). We show that the Fisher Information Matrix decouples into architectural sensitivity, recoverable via synthetic noise, and input correlations, approximable from modality-specific frequency statistics. We propose DP-KFC, which constructs KFAC preconditioners by probing networks with structured synthetic noise, requiring neither private nor public data. Empirically, DP-KFC consistently outperforms DP-SGD and adaptive baselines across diverse modalities in strong privacy regimes ($\varepsilon \leq 3$). DP-KFC matches private-data preconditioners while public-data variants degrade by up to $4.8\%$, showing that curvature can be estimated without consuming privacy budget or introducing distribution shift. This enables privacy-preserving learning in specialized domains (e.g., medical applications) where regulatory constraints make data scarce.
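The idea of building a KFAC preconditioner from synthetic probes alone can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the two-layer `tanh` network, the function names (`kfac_factors_from_noise`, `dp_noisy_gradient`, `precondition`), the damping value, and the use of white Gaussian inputs are all assumptions made for illustration. In particular, the abstract describes structured noise derived from modality-specific frequency statistics, which is not reproduced here. The sketch uses the standard KFAC approximation $F \approx A \otimes G$ for a layer's weights and the standard DP-SGD clip-and-noise step.

```python
import numpy as np

def kfac_factors_from_noise(W1, W2, n_probe=2048, damping=1e-3, seed=0):
    """KFAC factors (A, G) for layer 1 of y = W2 @ tanh(W1 @ x), from synthetic probes only."""
    rng = np.random.default_rng(seed)
    d_hidden, d_in = W1.shape
    d_out = W2.shape[0]
    # Synthetic inputs: white Gaussian noise (assumption; no private or public data).
    X = rng.standard_normal((n_probe, d_in))
    H = np.tanh(X @ W1.T)  # hidden activations, shape (n_probe, d_hidden)
    # Under a unit-variance Gaussian likelihood, the output gradient at a label
    # sampled from the model itself is N(0, I), so no real labels are needed.
    G_out = rng.standard_normal((n_probe, d_out))
    # Backpropagate to the layer-1 pre-activations.
    G_pre = (G_out @ W2) * (1.0 - H**2)
    # Damped second-moment estimates of inputs and backpropagated gradients.
    A = X.T @ X / n_probe + damping * np.eye(d_in)
    G = G_pre.T @ G_pre / n_probe + damping * np.eye(d_hidden)
    return A, G

def dp_noisy_gradient(per_sample_grads, clip=1.0, sigma=1.0, seed=0):
    """Standard DP-SGD step: clip each per-sample gradient, sum, add isotropic Gaussian noise."""
    rng = np.random.default_rng(seed)
    n = per_sample_grads.shape[0]
    norms = np.linalg.norm(per_sample_grads.reshape(n, -1), axis=1)
    scale = np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale[:, None, None]
    noise = rng.normal(0.0, sigma * clip, size=per_sample_grads.shape[1:])
    return (clipped.sum(axis=0) + noise) / n

def precondition(grad, A, G):
    """Apply (A kron G)^{-1} to vec(grad) without forming the Kronecker product."""
    left = np.linalg.solve(G, grad)      # G^{-1} @ grad
    return np.linalg.solve(A, left.T).T  # (G^{-1} @ grad) @ A^{-1}, using A symmetric

# Toy usage with hypothetical shapes: 6 inputs, 4 hidden units, 3 outputs.
rng = np.random.default_rng(1)
W1 = rng.standard_normal((4, 6)) * 0.5
W2 = rng.standard_normal((3, 4)) * 0.5
A, G = kfac_factors_from_noise(W1, W2)
per_sample = rng.standard_normal((8, 4, 6))  # stand-in per-sample gradients for W1
noisy = dp_noisy_gradient(per_sample, clip=1.0, sigma=1.0)
step = precondition(noisy, A, G)
```

In this sketch the preconditioner is applied to the already-noised gradient, which is post-processing and therefore leaves the differential privacy guarantee of the clip-and-noise step unchanged. Whether DP-KFC applies its preconditioner before or after clipping is not stated in the abstract.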
| Comments: | Accepted at the International Conference on Machine Learning (ICML 2026). 9 pages main text + appendix, 5 figures, 2 tables. Code: this https URL |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.13418 [cs.LG] (or arXiv:2605.13418v1 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.13418 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Marc Molina Van Den Bosch
[v1] Wed, 13 May 2026 12:14:00 UTC (931 KB)
