Analysis of nailfold capillaroscopy images with artificial intelligence: Data from literature and performance of machine learning and deep learning from images acquired in the SCLEROCAP study
Published in: | Microvascular research 2025-01, Vol.157, p.104753, Article 104753 |
---|---|
Format: | Article |
Language: | English |
Summary: | To evaluate the performance of machine learning and then deep learning to detect a systemic scleroderma (SSc) landscape from the same set of nailfold capillaroscopy (NC) images from the French prospective multicenter observational study SCLEROCAP.
NC images from the first 100 SCLEROCAP patients were analyzed to assess the performance of machine learning and then deep learning in identifying the SSc landscape, the NC images having previously been independently and consensually labeled by expert clinicians. Images were divided into a training set (70 %) and a validation set (30 %). After feature extraction from the NC images, we tested six classifiers (random forests (RF), support vector machine (SVM), logistic regression (LR), light gradient boosting (LGB), extreme gradient boosting (XGB), and K-nearest neighbors (KNN)) on the training set with five different combinations of the images. The performance of each classifier was evaluated by the F1 score. In the deep learning section, we tested three pre-trained models from the TIMM library (ResNet-18, DenseNet-121 and VGG-16) on raw NC images after applying image augmentation methods.
With machine learning, performance ranged from 0.60 to 0.73 for each variable, with Hu and Haralick moments being the most discriminating. Performance was highest with the RF, LGB and XGB models (F1 scores: 0.75–0.79). The highest score was obtained by combining all variables and using the LGB model (F1 score: 0.79 ± 0.05, p |
---|---|
ISSN: | 0026-2862 1095-9319 |
DOI: | 10.1016/j.mvr.2024.104753 |
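
The summary describes a 70 %/30 % split into training and validation sets and an evaluation by F1 score. The following is a minimal sketch of those two steps in plain Python, not the authors' code. The image identifiers, the toy labels, the fixed seed and the helper names `split_train_validation` and `f1_score` are all assumptions made for illustration, and the record does not state whether the study split by image or by patient.

```python
import random


def split_train_validation(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and split it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def f1_score(y_true, y_pred, positive=1):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical image identifiers; the count is illustrative only.
image_ids = [f"img_{i:03d}" for i in range(100)]
train_ids, val_ids = split_train_validation(image_ids)

# Toy labels (1 = SSc landscape, 0 = other); not data from the study.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
```

In practice a library routine such as scikit-learn's `f1_score` would normally replace the hand-written function; it is spelled out here only to make the metric explicit.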