Voting Classification-Based Diabetes Mellitus Prediction Using Hypertuned Machine-Learning Techniques
Diabetes mellitus is a chronic condition characterized by hyperglycemia. Given the growing morbidity, it is estimated that the number of diabetic patients worldwide will exceed 642 million by 2040, which means that one in ten adults will be affected by diabetes. Diabetes can also le...
Published in: | Mobile information systems 2022-03, Vol.2022, p.1-16 |
---|---|
Main Authors: | Mushtaq, Zaigham; Ramzan, Muhammad Farhan; Ali, Sikandar; Baseer, Samad; Samad, Ali; Husnain, Mujtaba |
Format: | Article |
Language: | English |
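Subjects: | Accuracy; Algorithms; Bayes Theorem; Blindness; Body mass index; Classification; Datasets; Diabetes; Diabetes mellitus; Disease; Electronic health records; Glucose; Health care; Hyperglycemia; Insulin; Machine learning; Medical research; Outliers (statistics); Oversampling; Preconditioning; Support vector machines; Voting |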
cited_by | cdi_FETCH-LOGICAL-c337t-856408581c2a8a292a8cc0cf934ac0c8dce7248f350bb7e1b263704a25f9e05d3 |
---|---|
cites | cdi_FETCH-LOGICAL-c337t-856408581c2a8a292a8cc0cf934ac0c8dce7248f350bb7e1b263704a25f9e05d3 |
container_end_page | 16 |
container_issue | |
container_start_page | 1 |
container_title | Mobile information systems |
container_volume | 2022 |
creator | Mushtaq, Zaigham ; Ramzan, Muhammad Farhan ; Ali, Sikandar ; Baseer, Samad ; Samad, Ali ; Husnain, Mujtaba |
description | Diabetes mellitus is a chronic condition characterized by hyperglycemia. Given the growing morbidity, it is estimated that the number of diabetic patients worldwide will exceed 642 million by 2040, which means that one in ten adults will be affected by diabetes. Diabetes can also lead to other illnesses such as heart attacks, kidney damage, and even blindness. The need to predict diabetes in advance motivated the development of a machine learning-based model. A dataset was obtained from an online repository for this work. The obtained dataset was imbalanced, which poses a challenge for prediction, so it was balanced using resampling techniques such as Tomek links and SMOTE. These techniques are also reported to remove outliers that are incomplete in the provided dataset, and outliers were additionally handled using the interquartile range (IQR) method. The research employed a two-stage model selection methodology. In the first stage, logistic regression, support vector machine, k-nearest neighbors, gradient boosting, Naive Bayes, and random forest classifiers were applied to determine the efficiency of prediction based on patients’ preconditioning. At this stage, random forest performed best, with an accuracy of 80.7% after applying the SMOTE oversampling technique to balance the dataset. In the second stage, three better-performing models (Naive Bayes, gradient boosting classifier, and random forest) were combined using a voting algorithm. The results were encouraging: the voting model obtained 82.0% accuracy with the default dataset and 81.7% accuracy with the balanced dataset. |
doi_str_mv | 10.1155/2022/6521532 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2643818447</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2643818447</sourcerecordid><originalsourceid>FETCH-LOGICAL-c337t-856408581c2a8a292a8cc0cf934ac0c8dce7248f350bb7e1b263704a25f9e05d3</originalsourceid><addsrcrecordid>eNp9kMtOwzAQRS0EEqWw4wMisYRQP2NnCeVRpFawaFF3keNMqKuQFNsR6t_jqF2zmTvSnJmruQhdE3xPiBATiimdZIISwegJGhElRZpjsT6NvZA8xUSuz9GF91uMM8yEHCH47IJtv5Jpo723tTU62K5NH7WHKnmyuoQAPllA09jQ--TDQWXNgCQrP-zN9jtwoW8jvdBmY1tI56BdO8yWYDat_enBX6KzWjcero46RquX5-V0ls7fX9-mD_PUMCZDqkTGsRKKGKqVpnmsxmBT54zrqKoyIClXNRO4LCWQkmZMYq6pqHPAomJjdHO4u3Pd4BuKbde7NloWNONMEcW5jNTdgTKu895BXeyc_dZuXxBcDEEWQ5DFMciI3x7w-F2lf-3_9B8_b3Mn</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2643818447</pqid></control><display><type>article</type><title>Voting Classification-Based Diabetes Mellitus Prediction Using Hypertuned Machine-Learning Techniques</title><source>Wiley_OA刊</source><creator>Mushtaq, Zaigham ; Ramzan, Muhammad Farhan ; Ali, Sikandar ; Baseer, Samad ; Samad, Ali ; Husnain, Mujtaba</creator><contributor>Farouk, Ahmed ; Ahmed Farouk</contributor><creatorcontrib>Mushtaq, Zaigham ; Ramzan, Muhammad Farhan ; Ali, Sikandar ; Baseer, Samad ; Samad, Ali ; Husnain, Mujtaba ; Farouk, Ahmed ; Ahmed Farouk</creatorcontrib><description>Diabetes mellitus is a hyperglycemia-like chronic condition that is a troublesome disease. It is estimated that, according to the growing morbidity, by 2040, the world will cross 642 million diabetic patients. This means that each one of the ten adults will be diabetes-affected. Diabetes can also lead to other illnesses such as heart attacks, kidney damage, and even blindness. The prediction of diabetes in advance motivates us to develop a machine learning-based model. A dataset was obtained from the online repository for this work. 
The obtained dataset was imbalanced. An imbalanced dataset presents a challenge that is needed to be balanced for prediction using multiple machine learning like Tomek and SMOTE. These techniques remove necessary outliers that are incomplete in the provided dataset. These outliers are also managed using the IQR method. Additionally, this research employed a two-stage model selection methodology. In the first stage, logistic regression, Support Vector Machine, k-nearest neighbors, gradient boost, Naive Bayes, and Random Forests were applied to determine the efficiency of prediction based on patients’ preconditioning. At this stage, Random Forest was found to be the best with an accuracy of 80.7% after applying SMOTE oversampling technique to balance the dataset. In the second stage, three better-performing models were used by utilizing a voting algorithm. The results were encouraging, and the model obtained 82.0% accuracy with the default dataset and 81.7% accuracy with the balanced dataset. Naive Bayes Theorem, Gradient Boosting Classifier, and Random Forest were used as inputs to the voting algorithm.</description><identifier>ISSN: 1574-017X</identifier><identifier>EISSN: 1875-905X</identifier><identifier>DOI: 10.1155/2022/6521532</identifier><language>eng</language><publisher>Amsterdam: Hindawi</publisher><subject>Accuracy ; Algorithms ; Bayes Theorem ; Blindness ; Body mass index ; Classification ; Datasets ; Diabetes ; Diabetes mellitus ; Disease ; Electronic health records ; Glucose ; Health care ; Hyperglycemia ; Insulin ; Machine learning ; Medical research ; Outliers (statistics) ; Oversampling ; Preconditioning ; Support vector machines ; Voting</subject><ispartof>Mobile information systems, 2022-03, Vol.2022, p.1-16</ispartof><rights>Copyright © 2022 Zaigham Mushtaq et al.</rights><rights>Copyright © 2022 Zaigham Mushtaq et al. 
This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c337t-856408581c2a8a292a8cc0cf934ac0c8dce7248f350bb7e1b263704a25f9e05d3</citedby><cites>FETCH-LOGICAL-c337t-856408581c2a8a292a8cc0cf934ac0c8dce7248f350bb7e1b263704a25f9e05d3</cites><orcidid>0000-0002-3754-3450 ; 0000-0001-6061-1987 ; 0000-0001-9987-2535 ; 0000-0002-9964-4716 ; 0000-0002-2753-8615</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><contributor>Farouk, Ahmed</contributor><contributor>Ahmed Farouk</contributor><creatorcontrib>Mushtaq, Zaigham</creatorcontrib><creatorcontrib>Ramzan, Muhammad Farhan</creatorcontrib><creatorcontrib>Ali, Sikandar</creatorcontrib><creatorcontrib>Baseer, Samad</creatorcontrib><creatorcontrib>Samad, Ali</creatorcontrib><creatorcontrib>Husnain, Mujtaba</creatorcontrib><title>Voting Classification-Based Diabetes Mellitus Prediction Using Hypertuned Machine-Learning Techniques</title><title>Mobile information systems</title><description>Diabetes mellitus is a hyperglycemia-like chronic condition that is a troublesome disease. It is estimated that, according to the growing morbidity, by 2040, the world will cross 642 million diabetic patients. This means that each one of the ten adults will be diabetes-affected. Diabetes can also lead to other illnesses such as heart attacks, kidney damage, and even blindness. 
The prediction of diabetes in advance motivates us to develop a machine learning-based model. A dataset was obtained from the online repository for this work. The obtained dataset was imbalanced. An imbalanced dataset presents a challenge that is needed to be balanced for prediction using multiple machine learning like Tomek and SMOTE. These techniques remove necessary outliers that are incomplete in the provided dataset. These outliers are also managed using the IQR method. Additionally, this research employed a two-stage model selection methodology. In the first stage, logistic regression, Support Vector Machine, k-nearest neighbors, gradient boost, Naive Bayes, and Random Forests were applied to determine the efficiency of prediction based on patients’ preconditioning. At this stage, Random Forest was found to be the best with an accuracy of 80.7% after applying SMOTE oversampling technique to balance the dataset. In the second stage, three better-performing models were used by utilizing a voting algorithm. The results were encouraging, and the model obtained 82.0% accuracy with the default dataset and 81.7% accuracy with the balanced dataset. 
Naive Bayes Theorem, Gradient Boosting Classifier, and Random Forest were used as inputs to the voting algorithm.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Bayes Theorem</subject><subject>Blindness</subject><subject>Body mass index</subject><subject>Classification</subject><subject>Datasets</subject><subject>Diabetes</subject><subject>Diabetes mellitus</subject><subject>Disease</subject><subject>Electronic health records</subject><subject>Glucose</subject><subject>Health care</subject><subject>Hyperglycemia</subject><subject>Insulin</subject><subject>Machine learning</subject><subject>Medical research</subject><subject>Outliers (statistics)</subject><subject>Oversampling</subject><subject>Preconditioning</subject><subject>Support vector machines</subject><subject>Voting</subject><issn>1574-017X</issn><issn>1875-905X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9kMtOwzAQRS0EEqWw4wMisYRQP2NnCeVRpFawaFF3keNMqKuQFNsR6t_jqF2zmTvSnJmruQhdE3xPiBATiimdZIISwegJGhElRZpjsT6NvZA8xUSuz9GF91uMM8yEHCH47IJtv5Jpo723tTU62K5NH7WHKnmyuoQAPllA09jQ--TDQWXNgCQrP-zN9jtwoW8jvdBmY1tI56BdO8yWYDat_enBX6KzWjcero46RquX5-V0ls7fX9-mD_PUMCZDqkTGsRKKGKqVpnmsxmBT54zrqKoyIClXNRO4LCWQkmZMYq6pqHPAomJjdHO4u3Pd4BuKbde7NloWNONMEcW5jNTdgTKu895BXeyc_dZuXxBcDEEWQ5DFMciI3x7w-F2lf-3_9B8_b3Mn</recordid><startdate>20220319</startdate><enddate>20220319</enddate><creator>Mushtaq, Zaigham</creator><creator>Ramzan, Muhammad Farhan</creator><creator>Ali, Sikandar</creator><creator>Baseer, Samad</creator><creator>Samad, Ali</creator><creator>Husnain, Mujtaba</creator><general>Hindawi</general><general>Hindawi 
Limited</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-3754-3450</orcidid><orcidid>https://orcid.org/0000-0001-6061-1987</orcidid><orcidid>https://orcid.org/0000-0001-9987-2535</orcidid><orcidid>https://orcid.org/0000-0002-9964-4716</orcidid><orcidid>https://orcid.org/0000-0002-2753-8615</orcidid></search><sort><creationdate>20220319</creationdate><title>Voting Classification-Based Diabetes Mellitus Prediction Using Hypertuned Machine-Learning Techniques</title><author>Mushtaq, Zaigham ; Ramzan, Muhammad Farhan ; Ali, Sikandar ; Baseer, Samad ; Samad, Ali ; Husnain, Mujtaba</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c337t-856408581c2a8a292a8cc0cf934ac0c8dce7248f350bb7e1b263704a25f9e05d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Bayes Theorem</topic><topic>Blindness</topic><topic>Body mass index</topic><topic>Classification</topic><topic>Datasets</topic><topic>Diabetes</topic><topic>Diabetes mellitus</topic><topic>Disease</topic><topic>Electronic health records</topic><topic>Glucose</topic><topic>Health care</topic><topic>Hyperglycemia</topic><topic>Insulin</topic><topic>Machine learning</topic><topic>Medical research</topic><topic>Outliers (statistics)</topic><topic>Oversampling</topic><topic>Preconditioning</topic><topic>Support vector machines</topic><topic>Voting</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mushtaq, Zaigham</creatorcontrib><creatorcontrib>Ramzan, Muhammad Farhan</creatorcontrib><creatorcontrib>Ali, Sikandar</creatorcontrib><creatorcontrib>Baseer, Samad</creatorcontrib><creatorcontrib>Samad, 
Ali</creatorcontrib><creatorcontrib>Husnain, Mujtaba</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Mobile information systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mushtaq, Zaigham</au><au>Ramzan, Muhammad Farhan</au><au>Ali, Sikandar</au><au>Baseer, Samad</au><au>Samad, Ali</au><au>Husnain, Mujtaba</au><au>Farouk, Ahmed</au><au>Ahmed Farouk</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Voting Classification-Based Diabetes Mellitus Prediction Using Hypertuned Machine-Learning Techniques</atitle><jtitle>Mobile information systems</jtitle><date>2022-03-19</date><risdate>2022</risdate><volume>2022</volume><spage>1</spage><epage>16</epage><pages>1-16</pages><issn>1574-017X</issn><eissn>1875-905X</eissn><abstract>Diabetes mellitus is a hyperglycemia-like chronic condition that is a troublesome disease. It is estimated that, according to the growing morbidity, by 2040, the world will cross 642 million diabetic patients. This means that each one of the ten adults will be diabetes-affected. Diabetes can also lead to other illnesses such as heart attacks, kidney damage, and even blindness. The prediction of diabetes in advance motivates us to develop a machine learning-based model. 
A dataset was obtained from the online repository for this work. The obtained dataset was imbalanced. An imbalanced dataset presents a challenge that is needed to be balanced for prediction using multiple machine learning like Tomek and SMOTE. These techniques remove necessary outliers that are incomplete in the provided dataset. These outliers are also managed using the IQR method. Additionally, this research employed a two-stage model selection methodology. In the first stage, logistic regression, Support Vector Machine, k-nearest neighbors, gradient boost, Naive Bayes, and Random Forests were applied to determine the efficiency of prediction based on patients’ preconditioning. At this stage, Random Forest was found to be the best with an accuracy of 80.7% after applying SMOTE oversampling technique to balance the dataset. In the second stage, three better-performing models were used by utilizing a voting algorithm. The results were encouraging, and the model obtained 82.0% accuracy with the default dataset and 81.7% accuracy with the balanced dataset. Naive Bayes Theorem, Gradient Boosting Classifier, and Random Forest were used as inputs to the voting algorithm.</abstract><cop>Amsterdam</cop><pub>Hindawi</pub><doi>10.1155/2022/6521532</doi><tpages>16</tpages><orcidid>https://orcid.org/0000-0002-3754-3450</orcidid><orcidid>https://orcid.org/0000-0001-6061-1987</orcidid><orcidid>https://orcid.org/0000-0001-9987-2535</orcidid><orcidid>https://orcid.org/0000-0002-9964-4716</orcidid><orcidid>https://orcid.org/0000-0002-2753-8615</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1574-017X |
ispartof | Mobile information systems, 2022-03, Vol.2022, p.1-16 |
issn | 1574-017X 1875-905X |
language | eng |
recordid | cdi_proquest_journals_2643818447 |
source | Wiley_OA刊 |
subjects | Accuracy ; Algorithms ; Bayes Theorem ; Blindness ; Body mass index ; Classification ; Datasets ; Diabetes ; Diabetes mellitus ; Disease ; Electronic health records ; Glucose ; Health care ; Hyperglycemia ; Insulin ; Machine learning ; Medical research ; Outliers (statistics) ; Oversampling ; Preconditioning ; Support vector machines ; Voting |
title | Voting Classification-Based Diabetes Mellitus Prediction Using Hypertuned Machine-Learning Techniques |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T15%3A34%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Voting%20Classification-Based%20Diabetes%20Mellitus%20Prediction%20Using%20Hypertuned%20Machine-Learning%20Techniques&rft.jtitle=Mobile%20information%20systems&rft.au=Mushtaq,%20Zaigham&rft.date=2022-03-19&rft.volume=2022&rft.spage=1&rft.epage=16&rft.pages=1-16&rft.issn=1574-017X&rft.eissn=1875-905X&rft_id=info:doi/10.1155/2022/6521532&rft_dat=%3Cproquest_cross%3E2643818447%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c337t-856408581c2a8a292a8cc0cf934ac0c8dce7248f350bb7e1b263704a25f9e05d3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2643818447&rft_id=info:pmid/&rfr_iscdi=true |