
An ensemble Machine Learning approach for predicting Type-II diabetes mellitus based on lifestyle indicators

Bibliographic Details
Published in: Healthcare analytics (New York, N.Y.), 2022-11, Vol.2, p.100092, Article 100092
Main Authors: Ganie, Shahid Mohammad, Malik, Majid Bashir
Format: Article
Language: English
Description
Summary: Machine Learning (ML) is a branch of artificial intelligence that allows computers to learn without being explicitly programmed. ML has been widely used in healthcare to predict various chronic diseases. Predicting diabetes at an early stage is crucial for better clinical pathways that reduce complications and delay the onset of diabetes. In this study, a new ensemble learning-based framework is proposed for the early prediction of Type-II diabetes mellitus using lifestyle indicators. Different ensemble learning techniques, such as Bagging, Boosting, and Voting, are employed. Exploratory data analysis is used to improve the quality assessment of the dataset. The synthetic minority oversampling technique (SMOTE) is used for class balancing, and K-fold cross-validation is employed to validate the results. A feature engineering process is applied to calculate the contribution of lifestyle parameters. Among all the classification techniques, the bagged decision tree achieved the highest accuracy rate (99.41%), precision (99.13%), recall (95.83%), specificity (99.11%), and F1-score (99.15%), with a misclassification rate (MCR) of 0.86% and a receiver operating characteristic (ROC) curve score of 99.07%. The proposed framework can be used in the healthcare industry for the early prediction of diabetes. It can also be applied to other datasets that share common characteristics with the diabetes data.
• This paper presents a novel approach to predict type-II diabetes mellitus.
• We collected real lifestyle data of 1939 patients from different demographic regions.
• We performed exploratory data analysis to improve the quality assessment of the dataset.
• We used ensemble learning techniques (Bagging, Boosting, and Voting) for the prediction.
• We achieved the best accuracy rate of 99.14% using the Bagged Decision Tree.
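Most of the evaluation measures named in the summary (accuracy, precision, recall, specificity, F1-score, and MCR) can be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code: the names `ConfusionCounts` and `classification_metrics` and the example counts are hypothetical and chosen only for illustration. The ROC measure is left out because it needs predicted scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # diabetic patients predicted diabetic
    fp: int  # non-diabetic patients predicted diabetic
    tn: int  # non-diabetic patients predicted non-diabetic
    fn: int  # diabetic patients predicted non-diabetic


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard count-based metrics for a binary classifier."""
    total = c.tp + c.fp + c.tn + c.fn
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # also called sensitivity
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
        # MCR is the share of wrong predictions, i.e. 1 - accuracy.
        "mcr": _ratio(c.fp + c.fn, total),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not taken from the paper.
    example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

With these definitions, accuracy and MCR always sum to 1, and F1 lies between precision and recall.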
ISSN: 2772-4425
DOI: 10.1016/j.health.2022.100092