Abstract:In this retrospective multi-institutional study, a quantitative phenotyping framework, CT-IDP (CT Image-Derived Phenotypes) was developed on the MERLIN abdominal CT benchmark (training, validation, and test sets- 15,175, 5,018, and 5,082 studies, respectively) and externally evaluated on two independent dataset: Duke-Abdomen (2,000) and AMOS (1,107). Multi-organ segmentations were generated with TotalSegmentator and used to derive over 900 organ and compartment-level descriptors spanning morphometry, attenuation, and contextual/burden findings. Sparse disease-specific logistic regression with elastic-net regularization was trained on MERLIN and externally validated under a frozen specification. Performance was compared against a DINOv3-based vision-transformer baseline using AUC and average precision (AP), supported by phenotype-stratified audits and coefficient-level inspection. Macro-AUC for CT-IDP versus the baseline was 0.897 versus 0.880 on MERLIN, 0.877 versus 0.857 on the Duke-Abdomen dataset, and 0.780 versus 0.756 on AMOS.
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.09002 [cs.CV] |
| (or arXiv:2605.09002v1 [cs.CV] for this version) | |
| https://doi.org/10.48550/arXiv.2605.09002 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Lavsen Dahal [view email]
[v1]
Sat, 9 May 2026 15:39:02 UTC (1,497 KB)
