Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models


Abstract: The issue of algorithmic biases in deep learning has led to the development of various debiasing techniques, many of which rely on complex training procedures or dataset manipulation. However, an intriguing question arises: is it possible to extract fair and bias-agnostic subnetworks from standard vanilla-trained models without relying on additional data, such as an unbiased training set? In this work, we introduce Bias-Invariant Subnetwork Extraction (BISE), a learning strategy that identifies and isolates "bias-free" subnetworks that already exist within conventionally trained models, without retraining or finetuning the original parameters. Our approach demonstrates that such subnetworks can be extracted via pruning and can operate without modification, relying less on biased features while maintaining robust performance. Our findings contribute towards efficient bias mitigation through structural adaptation of pre-trained neural networks via parameter removal, as opposed to costly strategies that are either data-centric or involve (re)training all model parameters. Extensive experiments on common benchmarks show the advantages of our approach in terms of the performance and computational efficiency of the resulting debiased model.
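
The general idea of extracting a subnetwork by pruning while leaving the original parameters untouched can be illustrated with a minimal sketch. This is not the BISE procedure, which the abstract does not spell out; the names `extract_subnetwork`, `forward`, `scores`, and `threshold` are hypothetical, and the per-weight scores are hard-coded here, whereas in practice they would have to be learned under some debiasing objective.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def extract_subnetwork(
    weights: Matrix, scores: Matrix, threshold: float = 0.0
) -> Tuple[List[List[int]], Matrix]:
    """Build a binary mask from per-weight scores and apply it.

    The original `weights` are never modified: a weight is either kept
    as-is (mask 1) or removed (mask 0). The scoring rule is an assumption
    made for illustration only.
    """
    mask = [[1 if s > threshold else 0 for s in row] for row in scores]
    pruned = [
        [w * m for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]
    return mask, pruned


def forward(weights: Matrix, x: List[float]) -> List[float]:
    """Linear layer without bias: one dot product per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Frozen weights of a toy "vanilla-trained" layer (made-up values).
weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
# Hypothetical per-weight scores; positive means "keep this weight".
scores = [[0.9, -0.3, 0.1], [-0.2, 0.4, 0.8]]

mask, pruned = extract_subnetwork(weights, scores)
outputs = forward(pruned, [1.0, 1.0, 1.0])
```

In this toy setting the surviving weights are exactly the original ones, so the subnetwork runs without any finetuning; only the choice of which weights to remove differs from the dense model.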

Comments:	This work has been accepted for publication at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2603.05582 [cs.LG]
	(or arXiv:2603.05582v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2603.05582 arXiv-issued DOI via DataCite

Submission history

From: Ivan Luiz De Moura Matos
[v1] Thu, 5 Mar 2026 18:54:24 UTC (4,058 KB)
[v2] Tue, 12 May 2026 21:44:20 UTC (3,945 KB)