Comparison of Different Machine Learning Approaches to Predict Small for Gestational Age Infants

Bibliographic Details
Published in: IEEE Transactions on Big Data, 2020-06, Vol. 6 (2), p. 334-346
Main Authors: Li, Jianqiang; Liu, Lu; Sun, Jingchao; Mo, Haowen; Yang, Ji-Jiang; Chen, Shi; Liu, Huiting; Wang, Qing; Pan, Hui
Format: Article
Language: English
Description
Summary: Diagnosing infants who are small for gestational age (SGA) at an early stage could help physicians introduce interventions sooner. Machine learning (ML) is envisioned as a tool for identifying SGA infants, but it has not been widely studied in this field. To develop effective SGA prediction models, we conducted four groups of experiments that considered, respectively, basic ML methods, imbalanced data, feature selection, and the time characteristics of variables. SGA infants were identified in data collected from 2010 to 2013, covering gestational ages between 24 and 42 weeks. Support vector machine (SVM), random forest (RF), logistic regression (LR), and Sparse LR models were trained with 10-fold cross-validation. Precision and the area under the receiver operating characteristic curve (AUC) were evaluated. In each group, SVM and Sparse LR performed similarly well. LR without any sparsity penalty performed worst, possibly because of overfitting. When the handling of imbalanced data was combined with feature selection, the RF ensemble classifier performed best, reaching the highest AUC (0.8547) with the help of expert knowledge. In other cases, RF performed worse than Sparse LR and SVM, possibly because its trees were fully grown.
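The evaluation protocol described in the summary (four classifier families, 10-fold cross-validation, precision and AUC) could be sketched roughly as follows. This is an illustration only, not the authors' code: it assumes scikit-learn, substitutes a synthetic imbalanced dataset for the clinical data (which is not part of this record), uses class weighting as a stand-in for whatever imbalance handling the paper applied, and picks all hyperparameters arbitrarily.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in data with roughly 10% positives,
# mimicking a minority "SGA" class. Not the paper's dataset.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)

# ASSUMPTION: class_weight="balanced" stands in for the paper's
# (unspecified here) treatment of imbalanced data.
models = {
    "SVM": make_pipeline(
        StandardScaler(), SVC(class_weight="balanced", random_state=0)
    ),
    "RF": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
    # A very large C makes the default L2 penalty negligible,
    # approximating LR without a sparsity penalty.
    "LR": make_pipeline(
        StandardScaler(), LogisticRegression(C=1e6, max_iter=2000)
    ),
    # The L1 penalty drives some coefficients to exactly zero ("Sparse LR").
    "Sparse LR": make_pipeline(
        StandardScaler(),
        LogisticRegression(
            penalty="l1", solver="liblinear", C=0.1, class_weight="balanced"
        ),
    ),
}

# Stratified folds keep the class ratio similar in each of the 10 splits.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=["roc_auc", "precision"])
    results[name] = {
        "auc": scores["test_roc_auc"].mean(),
        "precision": scores["test_precision"].mean(),
    }

for name, r in results.items():
    print(f"{name:10s} AUC={r['auc']:.3f} precision={r['precision']:.3f}")
```

Scores from this sketch say nothing about the paper's results; they depend entirely on the synthetic data and the chosen settings. Limiting tree depth in the random forest (for example through `max_depth`) would be one way to probe the summary's remark about fully grown trees.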
ISSN: 2332-7790, 2372-2096
DOI: 10.1109/TBDATA.2016.2620981