ClinTutor-R1: Advancing Scalable and Robust One-to-Many Alignment in Clinical Socratic Education

Abstract: While Large Language Models (LLMs) have achieved remarkable success in dyadic (one-on-one) instruction, they face significant challenges in one-to-many alignment, which arises in settings such as clinical ward rounds, where an instructor must simultaneously guide a diverse group of trainees. Current models often suffer from context dilution and goal misalignment, failing to balance individual scaffolding with collective learning progress. To address this, we introduce ClinEdu, a multi-agent pedagogical simulator that models the complexity of group dynamics. Leveraging this platform, we construct ClinTeach, a large-scale dataset of Socratic teaching dialogues, and propose ClinTutor-R1, the first vision-language agent explicitly architected to achieve one-to-many alignment in clinical education. ClinTutor-R1 employs an explicit internal thinking mechanism to model both individual belief states and group consensus. We validate our framework through a comprehensive protocol covering static benchmarks, in-situ interactive evaluation within ClinEdu, expert assessment, and a user study with 200 participants. Experimental results demonstrate that ClinTutor-R1 outperforms base models by over 20% and achieves parity with proprietary models, while maintaining instructional quality as student cohorts grow.

Comments:	Accepted by ICML 2026 (Spotlight)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2512.05671 [cs.CL]
	(or arXiv:2512.05671v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2512.05671 arXiv-issued DOI via DataCite

Submission history

From: Zhitao He
[v1] Fri, 5 Dec 2025 12:28:30 UTC (2,213 KB)
[v2] Mon, 1 Jun 2026 03:43:57 UTC (2,512 KB)