Practical Training Approaches for Discordant Atopic Dermatitis Severity Datasets: Merging Methods With Soft-Label and Train-Set Pruning

Bibliographic Details
Published in:	IEEE journal of biomedical and health informatics 2023-01, Vol.27 (1), p.166-175
Main Authors:	Cho, Soo Ick; Lee, Dongheon; Han, Byeol; Lee, Ji Su; Hong, Ji Yeon; Chung, Jin Ho; Lee, Dong Hun; Na, Jung-Im
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Atopic dermatitis; Bioinformatics; Biological system modeling; Convolutional neural networks; Datasets; Dermatitis; Dermatitis, Atopic; Dermatology; discordance; Eczema; Hospitals; Humans; Immunoglobulin A; investigator's global assessment; Merging; Neural networks; Neural Networks, Computer; Pruning; soft-label; Training

Description
Summary:	Objective assessment of atopic dermatitis (AD) is essential for choosing appropriate management strategies. This study investigated the performance of convolutional neural network (CNN) models in grading the severity of AD. Five board-certified dermatologists independently evaluated the severity of 9,192 AD images, based on the Investigator's Global Assessment (IGA) and six signs of AD. For CNN training, we applied three distinct approaches: 1) ensemble vs. integration, 2) hard-label vs. soft-label, and 3) train-set pruning. For IGA prediction, the two best models were chosen based on the macro-averaged AUROC and F1-score. The ensemble-soft-label-pruning model was chosen for its AUROC (0.943 and 0.927 on the internal and external validation sets, respectively), and the integration-soft-label-whole-dataset model was chosen for its F1-score (0.750 and 0.721 on the internal and external validation sets, respectively). CNN models trained on the multi-evaluator dataset outperformed models trained on an individual evaluator's dataset, and they performed better on the dataset in which the dermatologists' assessments were concordant. In conclusion, CNN models for AD could be improved by using datasets labeled by multiple evaluators and by merging methods with soft-label and train-set pruning.
ISSN:	2168-2194 2168-2208
DOI:	10.1109/JBHI.2022.3218166
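
The summary names soft labels and train-set pruning without defining them. The sketch below shows one plausible reading, not the authors' implementation: a soft label is taken to be the fraction of evaluators voting for each grade, a hard label is the majority grade, and pruning drops training images whose top grade falls below an agreement threshold. The function names, the five-grade scale, and the 0.5 threshold are all assumptions made for illustration.

```python
import numpy as np


def soft_labels(ratings: np.ndarray, n_classes: int) -> np.ndarray:
    """Turn (n_images, n_evaluators) integer grades into vote fractions.

    Assumption: a soft label is the share of evaluators choosing each grade.
    """
    one_hot = np.eye(n_classes)[ratings]  # (n_images, n_evaluators, n_classes)
    return one_hot.mean(axis=1)           # (n_images, n_classes)


def hard_labels(soft: np.ndarray) -> np.ndarray:
    """Majority grade per image; ties resolve to the lowest grade."""
    return soft.argmax(axis=1)


def prune_mask(soft: np.ndarray, min_agreement: float) -> np.ndarray:
    """Keep images whose most-voted grade reaches the agreement threshold.

    Assumption: pruning removes discordant images from the training set.
    """
    return soft.max(axis=1) >= min_agreement


def soft_cross_entropy(logits: np.ndarray, soft: np.ndarray) -> float:
    """Mean cross-entropy between model logits and soft-label targets."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(soft * log_probs).sum(axis=1).mean())


# Hypothetical grades from five evaluators for three images (grades 0-4).
ratings = np.array([
    [2, 2, 2, 2, 2],  # fully concordant
    [1, 2, 2, 3, 2],  # mostly concordant
    [0, 1, 2, 3, 4],  # fully discordant
])
soft = soft_labels(ratings, n_classes=5)
keep = prune_mask(soft, min_agreement=0.5)
train_targets = soft[keep]  # soft labels of the images that survive pruning
```

With these made-up ratings, the first two images are kept and the fully discordant third image is pruned. A real threshold would have to be tuned against a validation set.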