An ensemble Machine Learning approach for predicting Type-II diabetes mellitus based on lifestyle indicators
Published in: | Healthcare analytics (New York, N.Y.), 2022-11, Vol.2, p.100092, Article 100092 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Machine Learning (ML) is a branch of artificial intelligence that allows computers to learn without being explicitly programmed. ML has been widely used in healthcare to predict various chronic diseases. Predicting diabetes at an early stage is crucial for better clinical pathways that reduce complications and delay the onset of diabetes. In this study, a new ensemble learning-based framework is proposed for the early prediction of Type-II diabetes mellitus using lifestyle indicators. Ensemble learning techniques such as Bagging, Boosting, and Voting are employed. Exploratory data analysis is used to improve the quality assessment of the dataset. The synthetic minority oversampling technique (SMOTE) is used for class balancing, and K-fold cross-validation is employed to validate the results. A feature engineering process is applied to calculate the contribution of lifestyle parameters. Among all the classification techniques, the bagged decision tree achieved the highest accuracy rate (99.41%), precision (99.13%), recall (95.83%), specificity (99.11%), and F1-score (99.15%), with a misclassification rate (MCR) of 0.86% and a receiver operating characteristic (ROC) curve score of 99.07%. The proposed framework can be used in the healthcare industry for the early prediction of diabetes. It can also be applied to other datasets that share commonalities with diabetes data.
• This paper presents a novel approach to predict type-II diabetes mellitus.
• We collected real lifestyle data of 1939 patients from different demographic regions.
• We performed exploratory data analysis to improve the quality assessment of the dataset.
• We used ensemble learning techniques (Bagging, Boosting, and Voting) for the prediction.
• We achieved the best accuracy rate of 99.14% using the Bagged Decision Tree. |
---|---|
ISSN: | 2772-4425 |
DOI: | 10.1016/j.health.2022.100092 |
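
The summary reports accuracy, precision, recall, specificity, F1-score, and misclassification rate for a binary classifier. As a rough illustration of how these quantities relate to one another, the sketch below derives them from a confusion matrix using their standard textbook definitions. It is not the authors' code: the `ConfusionMatrix` class, the `classification_metrics` function, and the example counts are all hypothetical and are not taken from the article.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class assumed to be 'diabetic')."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(cm: ConfusionMatrix) -> Dict[str, float]:
    """Compute standard metrics as fractions in [0, 1] from confusion-matrix counts."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (cm.tp + cm.tn) / total
    precision = _safe_div(cm.tp, cm.tp + cm.fp)
    recall = _safe_div(cm.tp, cm.tp + cm.fn)        # also called sensitivity
    specificity = _safe_div(cm.tn, cm.tn + cm.fp)
    f1 = _safe_div(2 * precision * recall, precision + recall)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        # The misclassification rate is the complement of accuracy.
        "mcr": 1.0 - accuracy,
    }


# Hypothetical counts, chosen only to exercise the function.
example = ConfusionMatrix(tp=90, fp=10, tn=85, fn=15)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these definitions the misclassification rate and the accuracy always sum to 100%, which gives a quick consistency check when reading reported results.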