Application of Machine Learning Models for Early Detection and Accurate Classification of Type 2 Diabetes

Early detection of diabetes is essential to prevent serious complications in patients. The purpose of this work is to detect and classify type 2 diabetes in patients using machine learning (ML) models, and to select the most optimal model to predict the risk of diabetes. In this paper, five ML model...

Bibliographic Details
Published in: Diagnostics (Basel) 2023-07, Vol.13 (14), p.2383
Main Authors: Iparraguirre-Villanueva, Orlando; Espinola-Linares, Karina; Flores Castañeda, Rosalynn Ornella; Cabanillas-Carbonell, Michael
Format: Article
Language: English
container_issue 14
container_start_page 2383
container_title Diagnostics (Basel)
container_volume 13
creator Iparraguirre-Villanueva, Orlando
Espinola-Linares, Karina
Flores Castañeda, Rosalynn Ornella
Cabanillas-Carbonell, Michael
description Early detection of diabetes is essential to prevent serious complications in patients. The purpose of this work is to detect and classify type 2 diabetes in patients using machine learning (ML) models, and to select the most optimal model to predict the risk of diabetes. In this paper, five ML models, including K-nearest neighbor (K-NN), Bernoulli Naïve Bayes (BNB), decision tree (DT), logistic regression (LR), and support vector machine (SVM), are investigated to predict diabetic patients. A Kaggle-hosted Pima Indian dataset containing 768 patients with and without diabetes was used, including variables such as number of pregnancies the patient has had, blood glucose concentration, diastolic blood pressure, skinfold thickness, body insulin levels, body mass index (BMI), genetic background, diabetes in the family tree, age, and outcome (with/without diabetes). The results show that the K-NN and BNB models outperform the other models. The K-NN model obtained the best accuracy in detecting diabetes, with 79.6% accuracy, while the BNB model obtained 77.2% accuracy in detecting diabetes. Finally, it can be stated that the use of ML models for the early detection of diabetes is very promising.
doi_str_mv 10.3390/diagnostics13142383
format article
pmid 37510127
publisher Switzerland: MDPI AG
rights 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
orcid https://orcid.org/0000-0002-5573-359X
https://orcid.org/0000-0001-9675-0970
https://orcid.org/0000-0001-8185-2034
fulltext fulltext
identifier ISSN: 2075-4418
ispartof Diagnostics (Basel), 2023-07, Vol.13 (14), p.2383
issn 2075-4418
2075-4418
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_ea9bd23a226b45a99239683da3e77989
source Open Access: PubMed Central; Publicly Available Content Database (Proquest) (PQ_SDU_P3)
subjects Accuracy
Adults
Algorithms
analysis
Artificial intelligence
Blood sugar
Classification
Datasets
Diabetes
Diabetics
Disease
Insulin
Machine learning
Mathematical functions
Medical research
Medicine, Experimental
modeling
Neural networks
Patients
Risk factors
Type 2 diabetes
title Application of Machine Learning Models for Early Detection and Accurate Classification of Type 2 Diabetes