TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech

Abstract: Fine-grained morphosyntactic error annotation is important in clinical and developmental language research, yet it is labour-intensive, expert-dependent, and difficult to scale. We present TalkTag, a lightweight LLM-based tool fine-tuned to automate CHAT-style error annotation in spoken-language transcripts. Developed under conditions of extreme data scarcity using children's narrative data, the system shows that automated linguistic analysis is feasible in low-resource settings. Our evaluation shows that TalkTag produces encouragingly precise annotations while identifying instances where linguistic ambiguity makes automated tagging genuinely difficult. TalkTag thus offers a scalable alternative to fully manual annotation and practically viable support for morphosyntactic error annotation.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2606.01820 [cs.CL]
	(or arXiv:2606.01820v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2606.01820 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Shamira Venturini
[v1] Mon, 1 Jun 2026 07:34:24 UTC (44 KB)