Validation of deep learning-based computer-aided detection software use for interpretation of pulmonary abnormalities on chest radiographs and examination of factors that influence readers’ performance and final diagnosis

Bibliographic Details
Published in: Japanese Journal of Radiology, 2023-01, Vol. 41 (1), pp. 38-44
Main Authors: Toda, Naoki; Hashimoto, Masahiro; Iwabuchi, Yu; Nagasaka, Misa; Takeshita, Ryo; Yamada, Minoru; Yamada, Yoshitake; Jinzaki, Masahiro
Format: Article
Language: English
Description
Summary:
Purpose: To evaluate the performance of a deep learning-based computer-aided detection (CAD) software for detecting pulmonary nodules, masses, and consolidation on chest radiographs (CRs) and to examine the effect of readers’ experience and data characteristics on the sensitivity and final diagnosis.
Materials and methods: The CRs of 453 patients were retrospectively selected from two institutions. Among these CRs, 60 images with abnormal findings (pulmonary nodules, masses, and consolidation) and 140 without abnormal findings were randomly selected for sequential observer-performance testing. In the test, 12 readers (three radiologists, three pulmonologists, three non-pulmonology physicians, and three junior residents) interpreted 200 images with and without CAD, and the findings were compared. The weighted alternative free-response receiver operating characteristic (wAFROC) figure of merit (FOM) was used to analyze observer performance. The lesions that readers initially missed but CAD detected were stratified by anatomic location and degree of subtlety, and the adoption rate was calculated. Fisher’s exact test was used for comparison.
Results: The mean wAFROC FOM score of the 12 readers significantly improved from 0.746 to 0.810 with software assistance (P = 0.007). In the reader group with 
ISSN: 1867-1071, 1867-108X
DOI: 10.1007/s11604-022-01330-w
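
Illustrative sketch (not from the article): the Summary states that an adoption rate was calculated for lesions that readers initially missed but CAD detected, and that Fisher's exact test was used for comparison. The Python below shows one way such a comparison could be set up for two strata. The function names and the counts in the usage example are hypothetical, chosen only for illustration; they are not data from the study, and the study's actual analysis may have differed.

```python
from math import comb


def adoption_rate(adopted: int, total: int) -> float:
    """Fraction of CAD-detected, initially missed lesions that a reader accepted."""
    if total <= 0:
        raise ValueError("total must be positive")
    return adopted / total


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two strata (for example, two anatomic locations);
    columns are adopted / not adopted. The p-value is the sum of the
    hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability of seeing x adopted lesions in the first stratum
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


if __name__ == "__main__":
    # HYPOTHETICAL counts, not taken from the article.
    adopted_a, missed_a = 3, 4   # stratum A: 3 of 4 CAD marks adopted
    adopted_b, missed_b = 1, 4   # stratum B: 1 of 4 CAD marks adopted
    print("rate A:", adoption_rate(adopted_a, missed_a))
    print("rate B:", adoption_rate(adopted_b, missed_b))
    print("p-value:", fisher_exact_two_sided(
        adopted_a, missed_a - adopted_a,
        adopted_b, missed_b - adopted_b,
    ))
```

Fisher's exact test suits this kind of comparison because the counts per stratum can be small, where chi-squared approximations are unreliable.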