Predicting the onset of type 2 diabetes using wide and deep learning with electronic health records
Published in: | Computer methods and programs in biomedicine 2019-12, Vol.182, p.105055-105055, Article 105055 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Summary: | • Information from electronic health records can be used to help us understand what contributes to the onset of diseases including type 2 diabetes mellitus.
• Machine learning, including deep learning, has been used to predict the onset of diseases using information from electronic health records.
• Our work is the first to use wide and deep learning, a state-of-the-art deep learning architecture that achieves both memorisation and generalisation abilities, to predict the onset of type 2 diabetes mellitus using electronic health records.
• Our algorithm is better at predicting the onset of type 2 diabetes mellitus than other state-of-the-art machine learning algorithms using the same dataset with similar experimental settings.
• The synthetic minority over-sampling technique (SMOTE) was found to work better with the wide and deep learning framework than other machine learning algorithms in improving sensitivity for imbalanced electronic health record datasets.
Diabetes is responsible for considerable morbidity, healthcare utilisation and mortality in both developed and developing countries. Current methods of treating diabetes are inadequate and costly, so prevention is an important step in reducing the burden of diabetes and its complications. Electronic health records (EHRs), whether for an individual or for a population, have become important tools for understanding how diseases develop over time. Using EHRs to predict the onset of diabetes could improve the quality and efficiency of medical care. In this paper, we apply a wide and deep learning model that combines the strengths of a generalised linear model with various features and a deep feed-forward neural network to improve the prediction of the onset of type 2 diabetes mellitus (T2DM).
The proposed method was implemented by training various models against a logistic loss function using stochastic gradient descent. We applied this model to public hospital record data provided by Practice Fusion EHRs for the United States population. The dataset consists of de-identified electronic health records for 9948 patients, of whom 1904 have been diagnosed with T2DM. Prediction of diabetes in 2012 was based on data obtained from previous years (2009–2011). Class imbalance was handled by applying the Synthetic Minority Over-sampling Technique (SMOTE) to each cross-validation training fold, to analyse the performance when synthetic examples for the minority class are created. We used SMOTE of 150 and 300 percent, in |
---|---|
ISSN: | 0169-2607 1872-7565 |
DOI: | 10.1016/j.cmpb.2019.105055 |
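
The summary above mentions applying SMOTE to each cross-validation training fold. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The function names `smote` and `folds_with_smote`, the neighbour count `k=5`, the unstratified fold split, and the reading of "150 percent" as "add about 1.5 synthetic rows per minority row" are all assumptions made for illustration.

```python
import math
import random


def smote(minority, percent, k=5, rng=None):
    """Return synthetic minority rows; percent=150 adds about 1.5 rows per original row."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        return []  # interpolation needs at least one neighbour
    n_new = round(len(minority) * percent / 100)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (excluding the row itself)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(row, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def folds_with_smote(X, y, n_folds=5, percent=150, seed=0):
    """Yield (X_train, y_train, X_test, y_test); only the training part is oversampled."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    for f in range(n_folds):
        test_idx = set(indices[f::n_folds])
        X_train = [X[i] for i in indices if i not in test_idx]
        y_train = [y[i] for i in indices if i not in test_idx]
        X_test = [X[i] for i in indices if i in test_idx]
        y_test = [y[i] for i in indices if i in test_idx]
        # Label 1 is assumed to mark the minority (T2DM) class.
        minority = [x for x, label in zip(X_train, y_train) if label == 1]
        new_rows = smote(minority, percent, rng=rng)
        yield X_train + new_rows, y_train + [1] * len(new_rows), X_test, y_test
```

Oversampling inside the loop, after the split, keeps synthetic rows out of the held-out fold, so evaluation is done on real records only. A classifier of any kind could then be fitted on each `X_train, y_train` pair and scored on the matching `X_test, y_test`.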