Comparison of diagnostic accuracy and utility of artificial intelligence–optimized ACR TI-RADS and original ACR TI-RADS: a multi-center validation study based on 2061 thyroid nodules
Published in: | European radiology 2022-11, Vol.32 (11), p.7733-7742 |
---|---|
Format: | Article |
Language: | English |
Summary: | Objective
To determine if artificial intelligence–based modification of the Thyroid Imaging Reporting Data System (TI-RADS) would be better than the current American College of Radiology (ACR) TI-RADS for risk stratification of thyroid nodules.
Methods
A total of 2061 thyroid nodules (in 1859 patients) sampled by fine-needle aspiration or surgery between January 2017 and July 2020 were retrospectively analyzed. Two radiologists blinded to the pathologic diagnosis evaluated nodule features in five ultrasound categories and assigned TI-RADS scores using both ACR TI-RADS and AI TI-RADS. The reference standard was the postoperative pathological diagnosis or the cytopathological diagnosis according to the Bethesda system. To assess inter-rater agreement, another two radiologists independently scored a set of 100 nodules, and agreement was quantified using the intraclass correlation coefficient (ICC).
Results
AI TI-RADS assigned lower TI-RADS risk levels than ACR TI-RADS (p < 0.001) and had a larger area under the receiver operating characteristic curve (0.762 vs. 0.679, p < 0.001). The sensitivities of ACR TI-RADS and AI TI-RADS were similar (86.7% vs. 82.2%, p = 0.052), but specificity was higher with AI TI-RADS (70.2% vs. 49.2%, p < 0.001). AI TI-RADS downgraded 743 (48.63%) benign nodules, indicating that 328 unnecessary fine-needle aspirations (FNAs), 42.3% of the 776 biopsied nodules, could have been avoided. Inter-rater agreement was better with AI TI-RADS than with ACR TI-RADS (ICC, 0.861 vs. 0.808, p < 0.001).
Conclusion
AI TI-RADS can achieve meaningful reduction in the number of benign thyroid nodules recommended for biopsy and significantly improve specificity despite a slight decrease in sensitivity.
Key Points
• AI TI-RADS assigned lower TI-RADS risk levels than ACR TI-RADS, showing similar sensitivity but higher specificity.
• AI TI-RADS downgraded nearly half of the benign nodules, and unnecessary fine-needle aspiration (FNA) could have been avoided for 42.3% of the biopsied nodules.
• AI TI-RADS had better overall inter-rater agreement. |
---|---|
ISSN: | 0938-7994, 1432-1084 |
DOI: | 10.1007/s00330-022-08827-y |
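
The percentages quoted in the summary can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the reported figures, not the authors' analysis. It uses only numbers that appear in the Results, and it assumes that the 48.63% refers to the share of all benign nodules that were downgraded. The benign and malignant totals it prints are back-calculated under that assumption and are not stated in the summary.

```python
# Figures quoted in the Results section of the summary.
TOTAL_NODULES = 2061
DOWNGRADED_BENIGN = 743
DOWNGRADED_BENIGN_PCT = 48.63   # assumed to be a share of all benign nodules
AVOIDABLE_FNA = 328
BIOPSIED_NODULES = 776


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of biopsied nodules for which FNA could have been avoided.
avoidable_fna_pct = pct(AVOIDABLE_FNA, BIOPSIED_NODULES)

# Derived values (NOT stated in the summary): back-calculated from 743 = 48.63%.
implied_benign = round(DOWNGRADED_BENIGN / (DOWNGRADED_BENIGN_PCT / 100.0))
implied_malignant = TOTAL_NODULES - implied_benign

print(f"Avoidable FNA share: {avoidable_fna_pct}%")
print(f"Implied benign nodules: {implied_benign}")
print(f"Implied malignant nodules: {implied_malignant}")
```

The first value reproduces the 42.3% reported in the Results. The implied totals are illustrative and should be confirmed against the full text.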