Testing the performance, adequacy, and applicability of an artificial intelligence model for pediatric pneumonia diagnosis
Published in: Computer Methods and Programs in Biomedicine, 2023-12, Vol. 242, Article 107765
Format: Article
Language: English
Summary:

• Extending the classic validation workflow to deal with imperfect ground truth by using Bayesian latent class models (BLCA) to estimate accuracy.
• Extending the classic validation workflow to assess the applicability and acceptance of a deep learning model by employing explainable AI (XAI) methods to involve physicians and evaluate model acceptance and usefulness in routine clinical decisions.
• In-depth study of a support decision tool (SDT) specifically tailored to pediatric CAP.
Community-acquired pneumonia (CAP) is a common childhood infectious disease. Deep learning models show promise in X-ray interpretation and diagnosis, but their validation should be extended because of limitations in the current validation workflow. To extend the standard validation workflow, we propose a pilot test with the following characteristics. First, the assumption of a perfect ground truth (100% sensitive and specific) is unrealistic, as high intra- and inter-observer variability has been reported. To address this, we propose using Bayesian latent class models (BLCA) to estimate accuracy during the pilot. Additionally, assessing only the performance of a model, without considering its applicability and acceptance by physicians, is insufficient if we hope to integrate AI systems into day-to-day clinical practice. Therefore, we propose employing explainable artificial intelligence (XAI) methods during the pilot test to involve physicians and evaluate how well a deep learning model is accepted and how helpful it is for routine decisions, as well as to analyze its limitations by assessing the etiology. This study aims to apply the proposed pilot to test a deep convolutional neural network (CNN)-based model for identifying consolidation in pediatric chest X-ray (CXR) images, already validated using the standard workflow.
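For context, the sketch below shows one common way to produce the kind of XAI output described above: a Grad-CAM saliency map highlighting the image regions that drive a CNN's prediction. The backbone (ResNet-18), target layer, and random input tensor are illustrative placeholders, not the paper's actual architecture or XAI method.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Placeholder backbone; the paper's CNN is not reproduced here.
model = models.resnet18(weights="IMAGENET1K_V1")
model.eval()

activations, gradients = {}, {}

def fwd_hook(_, __, output):
    # Cache feature maps of the chosen conv layer on the forward pass
    activations["feat"] = output.detach()

def bwd_hook(_, grad_in, grad_out):
    # Cache gradients flowing into that layer on the backward pass
    gradients["feat"] = grad_out[0].detach()

layer = model.layer4[-1]  # last convolutional block
layer.register_forward_hook(fwd_hook)
layer.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)      # stand-in for a preprocessed CXR
logits = model(x)
logits[0, logits.argmax()].backward()  # gradient of the top class score

# Grad-CAM: weight each channel by its global-average-pooled gradient,
# sum, keep positive evidence, and upsample to the input resolution.
w = gradients["feat"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * activations["feat"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heatmap in [0, 1]
```

In practice the resulting heatmap would be overlaid on the CXR so that physicians can judge whether the model attends to the consolidated region.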
For the standard validation workflow, a total of 5856 public CXRs and 950 private CXRs were used to train and validate the performance of the CNN model; the performance of the model was estimated assuming a perfect ground truth. For the pilot test proposed in this article, a total of 190 pediatric CXR images were used to test the CNN-based support decision tool (SDT). The performance of the model on the pilot test was estimated using extensions of the two-test Bayesian latent class model (BLCA). The sensitivity, specificity, and accuracy of the model were also assessed, and the clinical characteristics of the patients were compared according to the model performance.
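As a rough illustration of the BLCA idea (not the paper's specific extensions), the following is a minimal two-test, one-population Hui-Walter latent class model in PyMC, assuming conditional independence between the CNN and a human reader given the true disease status; the cross-classified counts and Beta priors are hypothetical.

```python
import numpy as np
import pymc as pm

# Hypothetical cross-classified counts for two diagnostic tests:
# [T1+/T2+, T1+/T2-, T1-/T2+, T1-/T2-]
counts = np.array([52, 18, 15, 105])
n = int(counts.sum())

with pm.Model() as blca:
    # Priors on prevalence and test accuracies; informative Beta priors
    # are needed because this design is not identifiable from data alone.
    pi = pm.Beta("prevalence", 2, 5)
    se1 = pm.Beta("se_model", 10, 2)   # CNN sensitivity
    sp1 = pm.Beta("sp_model", 10, 2)   # CNN specificity
    se2 = pm.Beta("se_reader", 10, 2)  # reader sensitivity
    sp2 = pm.Beta("sp_reader", 10, 2)  # reader specificity

    # Cell probabilities under conditional independence given true status
    p_pp = pi * se1 * se2 + (1 - pi) * (1 - sp1) * (1 - sp2)
    p_pn = pi * se1 * (1 - se2) + (1 - pi) * (1 - sp1) * sp2
    p_np = pi * (1 - se1) * se2 + (1 - pi) * sp1 * (1 - sp2)
    p_nn = pi * (1 - se1) * (1 - se2) + (1 - pi) * sp1 * sp2

    pm.Multinomial("obs", n=n,
                   p=pm.math.stack([p_pp, p_pn, p_np, p_nn]),
                   observed=counts)
    trace = pm.sample(2000, tune=1000)
```

Because a two-test, one-population design is under-identified, posterior estimates of sensitivity and specificity lean on the priors; this is why latent class accuracy estimation avoids the unrealistic assumption of a perfect reference standard.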
ISSN: 0169-2607, 1872-7565
DOI: 10.1016/j.cmpb.2023.107765