Cognitive-Uncertainty Guided Knowledge Distillation for Accurate Classification of Student Misconceptions


Abstract: Accurately identifying student misconceptions is crucial for personalized education but faces three challenges: (1) data scarcity with a long-tail distribution, where authentic student reasoning is difficult to synthesize; (2) fuzzy boundaries between error categories, with high annotation noise; (3) a deployment paradox: large models overlook unconventional approaches due to pretraining bias and cannot be deployed on edge devices, while small models overfit to noise. Unlike traditional methods that increase diversity through large-scale data synthesis, we propose a two-stage knowledge distillation framework that mines high-value samples from existing data. The first stage performs standard distillation to transfer task capabilities. The second stage introduces a dual-layer marginal selection mechanism based on cognitive uncertainty, identifying four types of critical samples from teacher model uncertainty and confidence differences. For different data subsets, we design a difficulty-adaptive mechanism to balance hard- and soft-label contributions, enabling student models to inherit inter-class relationships from teacher soft labels while distinguishing ambiguous error types. Experiments show that with augmented training on only 10.30% of filtered samples, we achieve a MAP@3 of 0.9585 (+17.8%) on the MAP-Charting dataset, and using only a 4B-parameter model, we attain 84.38% accuracy on cross-topic tests of middle school algebra misconception benchmarks, significantly outperforming a state-of-the-art LLM (67.73%) and standard fine-tuned 72B models (81.25%). Our code is available at this https URL.
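For readers unfamiliar with the MAP@3 metric reported in the abstract, the following is a minimal sketch of how it is commonly computed, assuming each sample has exactly one correct label and the model submits up to three ranked guesses. The function name `map_at_3` and the example labels are hypothetical and do not come from the authors' code.

```python
from typing import Sequence


def map_at_3(true_labels: Sequence[str],
             ranked_predictions: Sequence[Sequence[str]]) -> float:
    """Mean average precision at 3, assuming one correct label per sample."""
    if len(true_labels) != len(ranked_predictions):
        raise ValueError("true_labels and ranked_predictions must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")

    total = 0.0
    for truth, preds in zip(true_labels, ranked_predictions):
        # Only the first three guesses count; a hit at rank r scores 1/r.
        for rank, guess in enumerate(preds[:3], start=1):
            if guess == truth:
                total += 1.0 / rank
                break  # credit only the first hit for this sample
    return total / len(true_labels)


# Hypothetical labels, for illustration only.
truths = ["sign_error", "wrong_fraction", "no_misconception"]
guesses = [
    ["sign_error", "wrong_fraction", "other"],       # hit at rank 1 -> 1.0
    ["other", "wrong_fraction", "sign_error"],       # hit at rank 2 -> 0.5
    ["sign_error", "wrong_fraction", "other"],       # miss          -> 0.0
]
score = map_at_3(truths, guesses)  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

Under this definition, a score of 0.9585 would mean that the correct label is ranked first for the large majority of samples, since a hit at rank 2 or 3 contributes only 1/2 or 1/3.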

Comments:	ACL 2026 Findings. 10 pages, 5 figures, 19 tables
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
MSC classes:	68T05, 97D70
ACM classes:	I.2.7; I.2.6; K.3.1
Cite as:	arXiv:2605.14752 [cs.LG]
	(or arXiv:2605.14752v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2605.14752 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Qirui Liu
[v1] Thu, 14 May 2026 12:17:38 UTC (699 KB)