Automated identification of uncertain cases in deep learning-based classification of dopamine transporter SPECT to improve clinical utility and acceptance

Bibliographic Details
Published in: European Journal of Nuclear Medicine and Molecular Imaging, 2024-04, Vol. 51 (5), p. 1333-1344
Main Authors: Budenkotte, Thomas, Apostolova, Ivayla, Opfer, Roland, Krüger, Julia, Klutmann, Susanne, Buchert, Ralph
Format: Article
Language: English
Description
Summary:

Purpose: Deep convolutional neural networks (CNN) are promising for automatic classification of dopamine transporter (DAT)-SPECT images. Reporting the certainty of CNN-based decisions is highly desired to flag cases that might be misclassified and, therefore, require particularly careful inspection by the user. The aim of the current study was to design and validate a CNN-based system for the identification of uncertain cases.

Methods: A network ensemble (NE) combining five CNNs was trained for binary classification of [123I]FP-CIT DAT-SPECT images as “normal” or “neurodegeneration-typical reduction” with high accuracy (NE for classification, NEfC). An uncertainty detection module (UDM) was obtained by combining two additional NEs, one trained for detection of “reduced” DAT-SPECT with high sensitivity, the other with high specificity. A case was considered “uncertain” if the “high sensitivity” NE and the “high specificity” NE disagreed. An internal “development” dataset of 1740 clinical DAT-SPECT images was used for training (n = 1250) and testing (n = 490). Two independent datasets with different image characteristics were used for testing only (n = 640, 645). Three established approaches for uncertainty detection were used for comparison (sigmoid, dropout, model averaging).

Results: In the test data from the development dataset, the NEfC achieved 98.0% accuracy. The UDM flagged 4.3% of all test cases as “uncertain”: 2.5% of the correctly classified cases and 90% of the misclassified cases. NEfC accuracy among “certain” cases was 99.8%. The three comparison methods were less effective in labelling misclassified cases as “uncertain” (40–80%). These findings were confirmed in both additional test datasets.

Conclusion: The UDM allows reliable identification of uncertain [123I]FP-CIT SPECT with high risk of misclassification. We recommend that automatic classification of [123I]FP-CIT SPECT images be combined with a UDM to improve clinical utility and acceptance. The proposed UDM method (“high sensitivity versus high specificity”) might also be useful for DAT imaging with other ligands and for other binary classification tasks.
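
The disagreement rule described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The summary does not say how the five CNN outputs in each ensemble are combined or how the "high sensitivity" and "high specificity" ensembles were trained, so the averaging of member scores, the 0.5 decision threshold, and the names `ensemble_score` and `classify_with_uncertainty` are all assumptions made for illustration only.

```python
from statistics import mean


def ensemble_score(member_scores):
    """Combine the outputs of one ensemble's member networks.

    ASSUMPTION: a plain average of per-network scores in [0, 1];
    the summary does not state the actual combination rule.
    """
    return mean(member_scores)


def classify_with_uncertainty(nefc_scores, sens_scores, spec_scores, threshold=0.5):
    """Return (label, uncertain) for one DAT-SPECT case.

    nefc_scores: member scores of the classification ensemble (NEfC)
    sens_scores: member scores of the "high sensitivity" ensemble
    spec_scores: member scores of the "high specificity" ensemble
    threshold:   ASSUMED cut-off applied to every ensemble score
    """
    # The reported class comes from the accuracy-oriented ensemble only.
    label = "reduced" if ensemble_score(nefc_scores) >= threshold else "normal"

    # Each auxiliary ensemble casts its own "reduced" / "normal" vote.
    sens_says_reduced = ensemble_score(sens_scores) >= threshold
    spec_says_reduced = ensemble_score(spec_scores) >= threshold

    # Rule from the summary: the case is "uncertain" if the two
    # auxiliary ensembles disagree.
    uncertain = sens_says_reduced != spec_says_reduced
    return label, uncertain
```

In this sketch the reported label always comes from the NEfC alone. The two auxiliary ensembles only decide whether the case is flagged for particularly careful inspection by the user.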
ISSN: 1619-7070, 1619-7089
DOI: 10.1007/s00259-023-06566-w