Quantum Advantage in Multi Agent Reinforcement Learning


Abstract: We present an empirical evaluation of the role of quantum entanglement in agent coordination within quantum multi agent reinforcement learning (QMARL). Although QMARL has attracted growing interest, most prior work evaluates quantum policies without provable baselines, making it impossible to rigorously distinguish quantum advantage from algorithmic coincidence. We address this gap by evaluating a decentralized QMARL framework in which variational quantum circuit (VQC) actors share entangled states. In the CHSH game, which has a mathematically proven classical performance ceiling of 0.75 win rate, entangled QMARL agents approach the Tsirelson limit of approximately 0.854, providing clear evidence of quantum advantage. Unentangled quantum circuits match the classical baseline, confirming that entanglement, and not the quantum circuit itself, is the active coordination mechanism. We also explore the effect of specific entanglement structures: some Bell states enable coordination gains, while others actively harm performance. On cooperative navigation (CoopNav), QMARL without entanglement achieves a $\sim2\times$ improvement in success rate over classical MAA2C ($\sim$0.85 versus $\sim$0.40), with the hybrid configuration (a quantum actor paired with a classical centralised critic) outperforming both fully classical and fully quantum solutions. We present our experimental analysis and discuss future work.
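The two CHSH reference values quoted in the abstract (0.75 and roughly 0.854) can be reproduced with a short calculation. The sketch below is not the paper's QMARL setup: it uses no learning and no VQC, only the textbook CHSH game (players win when a XOR b equals x AND y) with fixed measurement angles. The function names, the state constants, and the angle choices are illustrative assumptions, not taken from the paper.

```python
import itertools
import math

def classical_best():
    """Best win rate over all 16 deterministic strategies a = f(x), b = g(y)."""
    best = 0.0
    for a0, a1, b0, b1 in itertools.product((0, 1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        wins = sum((a[x] ^ b[y]) == (x & y) for x in (0, 1) for y in (0, 1))
        best = max(best, wins / 4)
    return best

def basis(theta):
    """Real single-qubit measurement basis rotated by theta: (outcome 0, outcome 1)."""
    return ((math.cos(theta), math.sin(theta)),
            (-math.sin(theta), math.cos(theta)))

def quantum_win_rate(state, alice_angles, bob_angles):
    """CHSH win rate for a real two-qubit state [c00, c01, c10, c11],
    with inputs x, y uniform and each player measuring at a fixed angle."""
    total = 0.0
    for x, y in itertools.product((0, 1), repeat=2):
        va, vb = basis(alice_angles[x]), basis(bob_angles[y])
        for a, b in itertools.product((0, 1), repeat=2):
            # Amplitude of joint outcome (a, b): <va[a]| (x) <vb[b]| state>
            amp = sum(va[a][i] * vb[b][j] * state[2 * i + j]
                      for i in (0, 1) for j in (0, 1))
            if (a ^ b) == (x & y):
                total += amp * amp
    return total / 4

S = 1 / math.sqrt(2)
PHI_PLUS = (S, 0.0, 0.0, S)     # (|00> + |11>) / sqrt(2)
PSI_MINUS = (0.0, S, -S, 0.0)   # (|01> - |10>) / sqrt(2)
PRODUCT_00 = (1.0, 0.0, 0.0, 0.0)  # unentangled |00>

# Standard textbook angles for the CHSH game (an assumption, not from the paper).
ALICE = (0.0, math.pi / 4)
BOB = (math.pi / 8, -math.pi / 8)

print(classical_best())                                          # 0.75
print(round(quantum_win_rate(PHI_PLUS, ALICE, BOB), 4))          # 0.8536
print(round(quantum_win_rate(PSI_MINUS, ALICE, BOB), 4))         # 0.1464
print(quantum_win_rate(PRODUCT_00, ALICE, BOB) <= 0.75 + 1e-12)  # True
```

With these fixed angles, the Bell state Φ+ reaches cos²(π/8), the Tsirelson value, while Ψ- falls far below the classical ceiling and the product state cannot exceed it. This only illustrates that the choice of shared state matters for a fixed measurement strategy; whether it is the same effect the paper reports for learned policies is not stated in the abstract.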

Comments:	19 pages
Subjects:	Machine Learning (cs.LG); Multiagent Systems (cs.MA); Quantum Physics (quant-ph)
Cite as:	arXiv:2605.14235 [cs.LG]
	(or arXiv:2605.14235v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2605.14235 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Simranjeet Singh Dahia
[v1] Thu, 14 May 2026 01:03:41 UTC (4,483 KB)