Abstract: The success of deep learning in high-dimensional settings is often attributed to the presence of low-dimensional structure in real-world data. While standard theoretical models typically assume that this structure lies in the target function, projecting unstructured inputs onto a low-dimensional subspace, data such as images, text or genomic sequences exhibit strong spatial correlations within the input space itself. In this paper, we propose a tractable model to study how these correlations affect the sample complexity of learning with gradient descent on shallow neural networks. Specifically, we consider targets that depend on a small number of latent Boolean variables, and input features grouped into clusters and correlated with the latent variables. Under an identifiability assumption, we show that for a layerwise gradient-descent variant, the sample complexity scales with the number of hidden variables and, when the signal-to-noise ratio is sufficiently high, is independent of the input dimension, up to logarithmic terms. We empirically test our theoretical findings on both synthetic and real data.
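To make the data model in the abstract concrete, the following is a minimal sketch of one way such inputs could be generated. It is not the construction used in the paper: the function name `sample_clustered_data`, the equal cluster sizes, the independent sign-flip noise controlled by `flip_prob`, and the choice of target (the product of the first two latent variables) are all assumptions made purely for illustration.

```python
import numpy as np


def sample_clustered_data(n, k, cluster_size, flip_prob, rng):
    """Draw n samples from a toy latent-cluster model (illustrative assumptions only)."""
    # Latent Boolean variables, uniform on {-1, +1}^k.
    z = rng.choice([-1, 1], size=(n, k))
    # Each latent variable is copied into its own cluster of `cluster_size` features,
    # so the input dimension is k * cluster_size.
    x = np.repeat(z, cluster_size, axis=1)
    # Assumed noise model: every feature has its sign flipped independently
    # with probability flip_prob.
    flips = rng.random(x.shape) < flip_prob
    x = np.where(flips, -x, x)
    # Hypothetical target depending only on the latent variables:
    # the product (parity) of the first two.
    y = z[:, 0] * z[:, 1]
    return x, y, z


rng = np.random.default_rng(0)
x, y, z = sample_clustered_data(n=1000, k=4, cluster_size=25, flip_prob=0.1, rng=rng)
print(x.shape, y.shape)  # input dimension is k * cluster_size
```

In this toy version, a smaller `flip_prob` plays the role of a higher signal-to-noise ratio: the features within a cluster agree with their latent variable more often, so the input dimension can grow through `cluster_size` while the target still depends on only `k` hidden variables.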
| Comments: | 10 pages main body, 2 figures |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.14927 [cs.LG] (or arXiv:2605.14927v1 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.14927 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Elisabetta Cornacchia
[v1] Thu, 14 May 2026 15:02:24 UTC (828 KB)
