Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents

Abstract: Most existing dialogue systems are user-driven, primarily designed to fulfill user requests. However, in many critical real-world scenarios, a conversational agent must proactively extract information to achieve its own objectives rather than merely respond. To address this gap, we introduce Inquisitive Conversational Agents (ICAs) and develop an ICA specifically tailored to U.S. Supreme Court oral arguments. We propose a Dual Hierarchical Reinforcement Learning framework featuring two cooperating RL agents, each with its own policy, to coordinate strategic dialogue management and fine-grained utterance generation. By learning when and how to ask probing questions, the agent emulates judicial questioning patterns and systematically uncovers crucial information to fulfill its legal objectives. Evaluations on a U.S. Supreme Court dataset show that our method outperforms various baselines across multiple metrics. It represents an important first step toward broader high-stakes, domain-specific applications.

Comments:	Accepted to Findings of ACL 2026
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2605.14057 [cs.CL]
	(or arXiv:2605.14057v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2605.14057 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Xubo Lin
[v1] Wed, 13 May 2026 19:29:11 UTC (1,982 KB)