Parametric Social Identity Injection and Diversification in Public Opinion Simulation

Abstract: Large language models (LLMs) have recently been adopted as synthetic agents for public opinion simulation, offering a promising alternative to costly and slow human surveys. Despite their scalability, current LLM-based simulation methods fail to capture social diversity: they flatten inter-group differences and produce overly homogeneous responses across demographic groups. We trace this limitation to a Diversity Collapse phenomenon in LLM hidden representations, where distinct social identities become increasingly indistinguishable across layers. Motivated by this observation, we propose Parametric Social Identity Injection (PSII), a general framework that injects explicit, parametric representations of demographic attributes and value orientations directly into intermediate hidden states of LLMs. Unlike prompt-based persona conditioning, PSII enables fine-grained and controllable identity modulation at the representation level. Extensive experiments on the World Values Survey using multiple open-source LLMs show that PSII significantly improves distributional fidelity, reducing KL divergence to real-world survey data, while also increasing response diversity. This work provides new insights into representation-level control of LLM agents and advances scalable, diversity-aware public opinion simulation.
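
The abstract reports distributional fidelity as KL divergence between simulated and real survey responses. As a minimal sketch of how such a score could be computed for a single question, the Python below compares two answer distributions. The function names, the answer counts, the smoothing constant `eps`, and the direction KL(survey || simulated) are all assumptions made for illustration; the abstract does not state the paper's exact evaluation protocol.

```python
import math


def normalize(counts):
    """Turn raw answer counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same options.

    Terms with p_i == 0 contribute nothing. q_i is floored at eps (an
    assumed smoothing choice) so an option the simulation never picks
    yields a large finite penalty instead of a division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same answer options")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Hypothetical counts for one four-option question (not data from the paper).
survey_counts = [120, 340, 280, 60]      # human respondents
simulated_counts = [40, 520, 230, 10]    # LLM agents, concentrated on one option

survey = normalize(survey_counts)
simulated = normalize(simulated_counts)

# Lower values mean the simulated distribution is closer to the survey.
print(kl_divergence(survey, simulated))
```

Under these assumptions, a simulation whose answers concentrate on one option scores a higher divergence than one that reproduces the spread of the survey responses.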

Comments:	Accepted to KDD 2026 Research Track.
Subjects:	Computation and Language (cs.CL)
MSC classes:	68T50
ACM classes:	I.2.7
Cite as:	arXiv:2603.16142 [cs.CL]
	(or arXiv:2603.16142v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2603.16142
Related DOI:	https://doi.org/10.1145/3770855.3817926

Submission history

From: Hexi Wang
[v1] Tue, 17 Mar 2026 05:52:03 UTC (1,470 KB)
[v2] Mon, 1 Jun 2026 09:49:53 UTC (1,494 KB)