An International Non-Inferiority Study for the Benchmarking of AI for Routine Radiology Cases: Chest X-ray, Fluorography and Mammography

Bibliographic Details
Published in: Healthcare (Basel), 2023-06, Vol. 11 (12), p. 1684
Main Authors: Arzamasov, Kirill, Vasilev, Yuriy, Vladzymyrskyy, Anton, Omelyanskaya, Olga, Shulkin, Igor, Kozikhina, Darya, Goncharova, Inna, Gelezhe, Pavel, Kirpichev, Yury, Bobrovskaya, Tatiana, Andreychenko, Anna
Format: Article
Language: English
Description
Summary: An international reader study was conducted to gauge the average diagnostic accuracy of radiologists interpreting chest X-ray, fluorography, and mammography images, and to establish requirements for stand-alone radiological artificial intelligence (AI) models. The retrospective studies in the datasets were labelled as containing or not containing target pathological findings based on a consensus of two experienced radiologists and, where applicable, the results of a laboratory test and follow-up examination. A total of 204 radiologists from 11 countries, with varying levels of experience, assessed the dataset on a 5-point Likert scale via a web platform. Eight commercial radiological AI models analyzed the same dataset. The AUROC was 0.87 (95% CI 0.83-0.9) for AI versus 0.96 (95% CI 0.94-0.97) for radiologists. The sensitivity of AI versus radiologists was 0.71 (95% CI 0.64-0.78) versus 0.91 (95% CI 0.86-0.95), and the specificity was 0.93 (95% CI 0.89-0.96) versus 0.9 (95% CI 0.85-0.94). The overall diagnostic accuracy of radiologists was superior to that of AI for chest X-ray and mammography. However, the accuracy of AI was noninferior to that of the least experienced radiologists for mammography and fluorography, and to that of all radiologists for chest X-ray. Therefore, an AI-based first reading could be recommended to reduce the workload of radiologists for the most common radiological studies, such as chest X-ray and mammography.
ISSN: 2227-9032
DOI: 10.3390/healthcare11121684