Systematic comparison of machine learning algorithms to develop and validate predictive models for periodontitis

Aim The aim of this study was to compare the validity of different machine learning algorithms to develop and validate predictive models for periodontitis. Materials and Methods Using national survey data from Taiwan (n = 3453) and the United States (n = 3685), predictors of periodontitis were extra...

Bibliographic Details
Published in:	Journal of clinical periodontology 2022-10, Vol.49 (10), p.958-969
Main Authors:	Bashir, Nasir Z.; Rahman, Zahid; Chen, Sam Li‐Sheng
Format:	Article
Language:	English
Subjects:	Algorithms; computing; Diagnosis, Epidemiology and Associated Co‐morbidities; Gum disease; Humans; Learning algorithms; Machine Learning; Original; Periodontitis; Periodontitis - diagnosis; Prediction models; predictive modelling; Predictive Value of Tests; ROC Curve; statistics

cited_by	cdi_FETCH-LOGICAL-c4482-18b501b0d259f8deb2e7bd3c65c77f7e6a0b7bef923aee576eba88c524f1f9213
cites	cdi_FETCH-LOGICAL-c4482-18b501b0d259f8deb2e7bd3c65c77f7e6a0b7bef923aee576eba88c524f1f9213
container_end_page	969
container_issue	10
container_start_page	958
container_title	Journal of clinical periodontology
container_volume	49
creator	Bashir, Nasir Z.; Rahman, Zahid; Chen, Sam Li‐Sheng
description	Aim The aim of this study was to compare the validity of different machine learning algorithms to develop and validate predictive models for periodontitis. Materials and Methods Using national survey data from Taiwan (n = 3453) and the United States (n = 3685), predictors of periodontitis were extracted from the datasets and pre‐processed, and then 10 machine learning algorithms were trained to develop predictive models. The models were validated both internally (bootstrap sampling) and externally (alternative country's dataset). The algorithms were compared across six performance metrics ([i] area under the curve for the receiver operating characteristic [AUC], [ii] accuracy, [iii] sensitivity, [iv] specificity, [v] positive predictive value, and [vi] negative predictive value) and two methods of data pre‐processing ([i] machine‐learning‐based feature selection and [ii] dimensionality reduction into principal components). Results Many algorithms showed extremely strong performance during internal validation (AUC > 0.95, accuracy > 95%). However, this was not replicated in external validation, where predictive performance of all algorithms dropped off drastically. Furthermore, predictive performance differed according to data pre‐processing methodology and the cohort on which they were trained. Conclusions Larger sample sizes and more complex predictors of periodontitis are required before machine learning can be leveraged to its full potential.
doi_str_mv	10.1111/jcpe.13692
format	article
fullrecord	<record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9796669</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2685033618</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4482-18b501b0d259f8deb2e7bd3c65c77f7e6a0b7bef923aee576eba88c524f1f9213</originalsourceid><addsrcrecordid>eNp9kc2KFDEUhYMoTju68QEk4EaEGvNTSao2wtCMfwwoqOAupJJb3WlSSZmkW_rtrbHHQV2YTSD5-Dj3HoSeUnJBl_NqZ2e4oFz27B5aUUlIQwT9dh-tCCe8kb3qz9CjUnaEUMU5f4jOuFAdVYyt0Pz5WCpMpnqLbZpmk31JEacRT8ZufQQcwOTo4wabsEnZ1-1UcE3YwQFCmrGJDh9M8M5UwHMG5231B8BTchAKHlPGM2SfXIrVV18eowejCQWe3N7n6Oubqy_rd831x7fv15fXjW3bjjW0GwShA3FM9GPnYGCgBsetFFapUYE0ZFADjD3jBkAoCYPpOitYO9LlkfJz9PrknffDBM5CrNkEPWc_mXzUyXj990_0W71JB72sS0rZL4IXt4Kcvu-hVD35YiEEEyHti2ayE4RzSbsFff4Pukv7HJfxNFNUyK5thVqolyfK5lRKhvEuDCX6pkh9U6T-VeQCP_sz_h36u7kFoCfghw9w_I9Kf1h_ujpJfwIjWayL</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2715684457</pqid></control><display><type>article</type><title>Systematic comparison of machine learning algorithms to develop and validate predictive models for periodontitis</title><source>Wiley-Blackwell Read & Publish Collection</source><creator>Bashir, Nasir Z. ; Rahman, Zahid ; Chen, Sam Li‐Sheng</creator><creatorcontrib>Bashir, Nasir Z. ; Rahman, Zahid ; Chen, Sam Li‐Sheng</creatorcontrib><description>Aim The aim of this study was to compare the validity of different machine learning algorithms to develop and validate predictive models for periodontitis. Materials and Methods Using national survey data from Taiwan (n = 3453) and the United States (n = 3685), predictors of periodontitis were extracted from the datasets and pre‐processed, and then 10 machine learning algorithms were trained to develop predictive models. The models were validated both internally (bootstrap sampling) and externally (alternative country's dataset). 
The algorithms were compared across six performance metrics ([i] area under the curve for the receiver operating characteristic [AUC], [ii] accuracy, [iii] sensitivity, [iv] specificity, [v] positive predictive value, and [vi] negative predictive value) and two methods of data pre‐processing ([i] machine‐learning‐based feature selection and [ii] dimensionality reduction into principal components). Results Many algorithms showed extremely strong performance during internal validation (AUC > 0.95, accuracy > 95%). However, this was not replicated in external validation, where predictive performance of all algorithms dropped off drastically. Furthermore, predictive performance differed according to data pre‐processing methodology and the cohort on which they were trained. Conclusions Larger sample sizes and more complex predictors of periodontitis are required before machine learning can be leveraged to its full potential.</description><identifier>ISSN: 0303-6979</identifier><identifier>EISSN: 1600-051X</identifier><identifier>DOI: 10.1111/jcpe.13692</identifier><identifier>PMID: 35781722</identifier><language>eng</language><publisher>Oxford, UK: Blackwell Publishing Ltd</publisher><subject>Algorithms ; computing ; Diagnosis, Epidemiology and Associated Co‐morbidities ; Gum disease ; Humans ; Learning algorithms ; Machine Learning ; Original ; Periodontitis ; Periodontitis - diagnosis ; Prediction models ; predictive modelling ; Predictive Value of Tests ; ROC Curve ; statistics</subject><ispartof>Journal of clinical periodontology, 2022-10, Vol.49 (10), p.958-969</ispartof><rights>2022 The Authors. published by John Wiley & Sons Ltd.</rights><rights>2022 The Authors. Journal of Clinical Periodontology published by John Wiley & Sons Ltd.</rights><rights>2022. This article is published under http://creativecommons.org/licenses/by-nc/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4482-18b501b0d259f8deb2e7bd3c65c77f7e6a0b7bef923aee576eba88c524f1f9213</citedby><cites>FETCH-LOGICAL-c4482-18b501b0d259f8deb2e7bd3c65c77f7e6a0b7bef923aee576eba88c524f1f9213</cites><orcidid>0000-0001-7416-7610 ; 0000-0001-9750-3015</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35781722$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Bashir, Nasir Z.</creatorcontrib><creatorcontrib>Rahman, Zahid</creatorcontrib><creatorcontrib>Chen, Sam Li‐Sheng</creatorcontrib><title>Systematic comparison of machine learning algorithms to develop and validate predictive models for periodontitis</title><title>Journal of clinical periodontology</title><addtitle>J Clin Periodontol</addtitle><description>Aim The aim of this study was to compare the validity of different machine learning algorithms to develop and validate predictive models for periodontitis. Materials and Methods Using national survey data from Taiwan (n = 3453) and the United States (n = 3685), predictors of periodontitis were extracted from the datasets and pre‐processed, and then 10 machine learning algorithms were trained to develop predictive models. The models were validated both internally (bootstrap sampling) and externally (alternative country's dataset). 
The algorithms were compared across six performance metrics ([i] area under the curve for the receiver operating characteristic [AUC], [ii] accuracy, [iii] sensitivity, [iv] specificity, [v] positive predictive value, and [vi] negative predictive value) and two methods of data pre‐processing ([i] machine‐learning‐based feature selection and [ii] dimensionality reduction into principal components). Results Many algorithms showed extremely strong performance during internal validation (AUC > 0.95, accuracy > 95%). However, this was not replicated in external validation, where predictive performance of all algorithms dropped off drastically. Furthermore, predictive performance differed according to data pre‐processing methodology and the cohort on which they were trained. Conclusions Larger sample sizes and more complex predictors of periodontitis are required before machine learning can be leveraged to its full potential.</description><subject>Algorithms</subject><subject>computing</subject><subject>Diagnosis, Epidemiology and Associated Co‐morbidities</subject><subject>Gum disease</subject><subject>Humans</subject><subject>Learning algorithms</subject><subject>Machine Learning</subject><subject>Original</subject><subject>Periodontitis</subject><subject>Periodontitis - diagnosis</subject><subject>Prediction models</subject><subject>predictive modelling</subject><subject>Predictive Value of Tests</subject><subject>ROC 
Curve</subject><subject>statistics</subject><issn>0303-6979</issn><issn>1600-051X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>24P</sourceid><recordid>eNp9kc2KFDEUhYMoTju68QEk4EaEGvNTSao2wtCMfwwoqOAupJJb3WlSSZmkW_rtrbHHQV2YTSD5-Dj3HoSeUnJBl_NqZ2e4oFz27B5aUUlIQwT9dh-tCCe8kb3qz9CjUnaEUMU5f4jOuFAdVYyt0Pz5WCpMpnqLbZpmk31JEacRT8ZufQQcwOTo4wabsEnZ1-1UcE3YwQFCmrGJDh9M8M5UwHMG5231B8BTchAKHlPGM2SfXIrVV18eowejCQWe3N7n6Oubqy_rd831x7fv15fXjW3bjjW0GwShA3FM9GPnYGCgBsetFFapUYE0ZFADjD3jBkAoCYPpOitYO9LlkfJz9PrknffDBM5CrNkEPWc_mXzUyXj990_0W71JB72sS0rZL4IXt4Kcvu-hVD35YiEEEyHti2ayE4RzSbsFff4Pukv7HJfxNFNUyK5thVqolyfK5lRKhvEuDCX6pkh9U6T-VeQCP_sz_h36u7kFoCfghw9w_I9Kf1h_ujpJfwIjWayL</recordid><startdate>202210</startdate><enddate>202210</enddate><creator>Bashir, Nasir Z.</creator><creator>Rahman, Zahid</creator><creator>Chen, Sam Li‐Sheng</creator><general>Blackwell Publishing Ltd</general><scope>24P</scope><scope>WIN</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QP</scope><scope>K9.</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-7416-7610</orcidid><orcidid>https://orcid.org/0000-0001-9750-3015</orcidid></search><sort><creationdate>202210</creationdate><title>Systematic comparison of machine learning algorithms to develop and validate predictive models for periodontitis</title><author>Bashir, Nasir Z. 
; Rahman, Zahid ; Chen, Sam Li‐Sheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4482-18b501b0d259f8deb2e7bd3c65c77f7e6a0b7bef923aee576eba88c524f1f9213</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>computing</topic><topic>Diagnosis, Epidemiology and Associated Co‐morbidities</topic><topic>Gum disease</topic><topic>Humans</topic><topic>Learning algorithms</topic><topic>Machine Learning</topic><topic>Original</topic><topic>Periodontitis</topic><topic>Periodontitis - diagnosis</topic><topic>Prediction models</topic><topic>predictive modelling</topic><topic>Predictive Value of Tests</topic><topic>ROC Curve</topic><topic>statistics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bashir, Nasir Z.</creatorcontrib><creatorcontrib>Rahman, Zahid</creatorcontrib><creatorcontrib>Chen, Sam Li‐Sheng</creatorcontrib><collection>Wiley_OA刊</collection><collection>Wiley Online Library Free Content</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of clinical periodontology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Bashir, Nasir Z.</au><au>Rahman, Zahid</au><au>Chen, Sam Li‐Sheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Systematic comparison of machine learning algorithms to develop and validate predictive models for 
periodontitis</atitle><jtitle>Journal of clinical periodontology</jtitle><addtitle>J Clin Periodontol</addtitle><date>2022-10</date><risdate>2022</risdate><volume>49</volume><issue>10</issue><spage>958</spage><epage>969</epage><pages>958-969</pages><issn>0303-6979</issn><eissn>1600-051X</eissn><abstract>Aim The aim of this study was to compare the validity of different machine learning algorithms to develop and validate predictive models for periodontitis. Materials and Methods Using national survey data from Taiwan (n = 3453) and the United States (n = 3685), predictors of periodontitis were extracted from the datasets and pre‐processed, and then 10 machine learning algorithms were trained to develop predictive models. The models were validated both internally (bootstrap sampling) and externally (alternative country's dataset). The algorithms were compared across six performance metrics ([i] area under the curve for the receiver operating characteristic [AUC], [ii] accuracy, [iii] sensitivity, [iv] specificity, [v] positive predictive value, and [vi] negative predictive value) and two methods of data pre‐processing ([i] machine‐learning‐based feature selection and [ii] dimensionality reduction into principal components). Results Many algorithms showed extremely strong performance during internal validation (AUC > 0.95, accuracy > 95%). However, this was not replicated in external validation, where predictive performance of all algorithms dropped off drastically. Furthermore, predictive performance differed according to data pre‐processing methodology and the cohort on which they were trained. 
Conclusions Larger sample sizes and more complex predictors of periodontitis are required before machine learning can be leveraged to its full potential.</abstract><cop>Oxford, UK</cop><pub>Blackwell Publishing Ltd</pub><pmid>35781722</pmid><doi>10.1111/jcpe.13692</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0001-7416-7610</orcidid><orcidid>https://orcid.org/0000-0001-9750-3015</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0303-6979
ispartof	Journal of clinical periodontology, 2022-10, Vol.49 (10), p.958-969
issn	0303-6979 1600-051X
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9796669
source	Wiley-Blackwell Read & Publish Collection
subjects	Algorithms; computing; Diagnosis, Epidemiology and Associated Co‐morbidities; Gum disease; Humans; Learning algorithms; Machine Learning; Original; Periodontitis; Periodontitis - diagnosis; Prediction models; predictive modelling; Predictive Value of Tests; ROC Curve; statistics
title	Systematic comparison of machine learning algorithms to develop and validate predictive models for periodontitis
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T07%3A55%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Systematic%20comparison%20of%20machine%20learning%20algorithms%20to%20develop%20and%20validate%20predictive%20models%20for%20periodontitis&rft.jtitle=Journal%20of%20clinical%20periodontology&rft.au=Bashir,%20Nasir%20Z.&rft.date=2022-10&rft.volume=49&rft.issue=10&rft.spage=958&rft.epage=969&rft.pages=958-969&rft.issn=0303-6979&rft.eissn=1600-051X&rft_id=info:doi/10.1111/jcpe.13692&rft_dat=%3Cproquest_pubme%3E2685033618%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c4482-18b501b0d259f8deb2e7bd3c65c77f7e6a0b7bef923aee576eba88c524f1f9213%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2715684457&rft_id=info:pmid/35781722&rfr_iscdi=true