Abstract: Early-stage Parkinson's disease (EarlyPD) detection from speech is clinically meaningful yet underexplored, and published results are hard to compare because studies differ in datasets, languages, tasks, evaluation protocols, and EarlyPD definitions. To address this issue, we propose the first benchmark for speech-based EarlyPD detection, with a speaker-independent split designed for fair and replicable cross-method evaluation on researcher-accessible datasets. The benchmark covers three common speech tasks and evaluates methods under different training-resource settings. We also present multi-dimensional evaluation breakdowns by dataset, aggregation level, gender, and disease stage to support fine-grained comparisons and clinical adoption. Our results provide a replicable reference and actionable insights, encouraging the adoption of this publicly available benchmark to advance robust and clinically meaningful EarlyPD detection from speech.
| Comments: | Submitted to Interspeech 2026 |
| Subjects: | Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Sound (cs.SD) |
| Cite as: | arXiv:2605.14066 [eess.AS] |
| | (or arXiv:2605.14066v1 [eess.AS] for this version) |
| | https://doi.org/10.48550/arXiv.2605.14066 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Terry Yi Zhong
[v1] Wed, 13 May 2026 19:43:01 UTC (47 KB)
