Abstract: Causal discovery from i.i.d. observational data is known to be generally ill-posed. We demonstrate that, given access to the distribution induced by a structural causal model and additional data from (in the best case) only two environments that differ sufficiently in their noise statistics, the causal graph is uniquely identifiable. Notably, this is the first result in the literature that guarantees recovery of the entire causal graph with a constant number of environments and arbitrary nonlinear mechanisms. Our only constraint is the Gaussianity of the noise terms; however, we propose potential ways to relax this requirement. Of interest on its own, we expand on the well-known duality between independent component analysis (ICA) and causal discovery: recent advancements have shown that nonlinear ICA can be solved from multiple environments, at least as many as the number of sources, and we show that the same can be achieved for causal discovery with access to much less auxiliary information.
| Comments: | Published as ICLR 2026 conference paper |
| Subjects: | Machine Learning (stat.ML); Machine Learning (cs.LG) |
| Cite as: | arXiv:2510.13583 [stat.ML] (or arXiv:2510.13583v4 [stat.ML] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2510.13583 (arXiv-issued DOI via DataCite) |
Submission history
From: Francesco Montagna
[v1] Wed, 15 Oct 2025 14:16:21 UTC (101 KB)
[v2] Tue, 2 Dec 2025 14:33:13 UTC (98 KB)
[v3] Wed, 18 Mar 2026 13:03:46 UTC (113 KB)
[v4] Thu, 14 May 2026 11:13:25 UTC (569 KB)
