
Impact of Different Mammography Systems on Artificial Intelligence Performance in Breast Cancer Screening


Bibliographic Details
Published in: Radiology. Artificial intelligence 2023-05, Vol.5 (3), p.e220146
Main Authors: de Vries, Clarisse F, Colosimo, Samantha J, Staff, Roger T, Dymiter, Jaroslaw A, Yearsley, Joseph, Dinneen, Deirdre, Boyle, Moragh, Harrison, David J, Anderson, Lesley A, Lip, Gerald
Format: Article
Language: English
Description
Summary: Artificial intelligence (AI) tools may assist breast screening mammography programs, but limited evidence supports their generalizability to new settings. This retrospective study used a 3-year dataset (April 1, 2016-March 31, 2019) from a U.K. regional screening program. The performance of a commercially available breast screening AI algorithm was assessed with a prespecified and a site-specific decision threshold to evaluate whether its performance was transferable to a new clinical site. The dataset consisted of women (aged approximately 50-70 years) who attended routine screening, excluding self-referrals, those with complex physical requirements, those who had undergone a previous mastectomy, and those whose screening involved a technical recall or lacked the four standard image views. In total, 55 916 screening attendees (mean age, 60 years ± 6 [SD]) met the inclusion criteria. The prespecified threshold resulted in high recall rates (48.3%, 21 929 of 45 444), which fell to 13.0% (5896 of 45 444) following threshold calibration, closer to the observed service level (5.0%, 2774 of 55 916). Recall rates also increased approximately threefold following a software upgrade on the mammography equipment, requiring per-software version thresholds. Using software-specific thresholds, the AI algorithm would have recalled 277 of 303 (91.4%) screen-detected cancers and 47 of 138 (34.1%) interval cancers. AI performance and thresholds should be validated for new clinical settings before deployment, while quality assurance systems should monitor AI performance for consistency.
Keywords: Breast, Screening, Mammography, Computer Applications-Detection/Diagnosis, Neoplasms-Primary, Technology Assessment
© RSNA, 2023.
ISSN: 2638-6100
DOI: 10.1148/ryai.220146
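
The percentages quoted in the summary can be re-derived from the counts it states. The following is a minimal sketch of that arithmetic only; the dictionary keys and the `percent` helper are illustrative names chosen here, not terms from the article, and the code does not reproduce the study's analysis.

```python
# Counts copied from the summary above. Key names are illustrative
# labels (assumptions), not identifiers used by the authors.
counts = {
    "recall_prespecified_threshold": (21_929, 45_444),
    "recall_calibrated_threshold": (5_896, 45_444),
    "recall_observed_service": (2_774, 55_916),
    "screen_detected_cancers_recalled": (277, 303),
    "interval_cancers_recalled": (47, 138),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Compute every rate once so the values can be inspected or printed.
rates = {name: percent(n, d) for name, (n, d) in counts.items()}

for name, value in rates.items():
    print(f"{name}: {value:.1f}%")
```

Each computed value agrees with the figure given in the summary to one decimal place (48.3%, 13.0%, 5.0%, 91.4%, and 34.1%).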