The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data

Predicting patient outcomes from genome-wide measurements holds significant promise for improving clinical care. The large number of measurements (eg, single nucleotide polymorphisms (SNPs)), however, makes this task computationally challenging. This paper evaluates the performance of an algorithm t...

Bibliographic Details
Published in: Journal of the American Medical Informatics Association : JAMIA 2011-07, Vol.18 (4), p.370-375
Main Authors: Wei, Wei; Visweswaran, Shyam; Cooper, Gregory F
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c486t-4ef80c9291a329824eb66e92151eae2ce9545fc0c7921bd9bec9e60c5e610043
cites cdi_FETCH-LOGICAL-c486t-4ef80c9291a329824eb66e92151eae2ce9545fc0c7921bd9bec9e60c5e610043
container_end_page 375
container_issue 4
container_start_page 370
container_title Journal of the American Medical Informatics Association : JAMIA
container_volume 18
creator Wei, Wei
Visweswaran, Shyam
Cooper, Gregory F
description Predicting patient outcomes from genome-wide measurements holds significant promise for improving clinical care. The large number of measurements (eg, single nucleotide polymorphisms (SNPs)), however, makes this task computationally challenging. This paper evaluates the performance of an algorithm that predicts patient outcomes from genome-wide data by efficiently model averaging over an exponential number of naive Bayes (NB) models. This model-averaged naive Bayes (MANB) method was applied to predict late onset Alzheimer's disease in 1411 individuals who each had 312,318 SNP measurements available as genome-wide predictive features. Its performance was compared to that of a naive Bayes algorithm without feature selection (NB) and with feature selection (FSNB). Performance of each algorithm was measured in terms of area under the ROC curve (AUC), calibration, and run time. The training time of MANB (16.1 s) was fast like NB (15.6 s), while FSNB (1684.2 s) was considerably slower. Each of the three algorithms required less than 0.1 s to predict the outcome of a test case. MANB had an AUC of 0.72, which is significantly better than the AUC of 0.59 by NB (p<0.00001), but not significantly different from the AUC of 0.71 by FSNB. MANB was better calibrated than NB, and FSNB was even better in calibration. A limitation was that only one dataset and two comparison algorithms were included in this study. MANB performed comparatively well in predicting a clinical outcome from a high-dimensional genome-wide dataset. These results provide support for including MANB in the methods used to predict outcomes from large, genome-wide datasets.
doi_str_mv 10.1136/amiajnl-2011-000101
format article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3128400</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>872132057</sourcerecordid><originalsourceid>FETCH-LOGICAL-c486t-4ef80c9291a329824eb66e92151eae2ce9545fc0c7921bd9bec9e60c5e610043</originalsourceid><addsrcrecordid>eNqFkU1r3DAQhkVpyfcvKBTdcnIykmXJugTSkKSFQC97yE3MyuNdBdtyJe-W5NfXYbehOeU0w8w7LzPzMPZVwIUQpb7EPuDT0BUShCgAQID4xI5EJU1hjXr8POegTVGBNIfsOOenWaJlWR2wQym0kRbMEfOLNXEcxy54nEIceGz5gGFL_Ds-U-Z9bKjjuKWEqzCs-BT5mKgJfuLX3cuaQk_pPPMmZMJMvE2x5ysaYk_Fn9AQb3DCU_alxS7T2T6esMXd7eLmR_Hw6_7nzfVD4VWtp0JRW4O30gospa2loqXWZKWoBCFJT7ZSVevBm7m2bOySvCUNviItAFR5wq52tuNm2VPjaZgSdm5Mocf07CIG974zhLVbxa0rhawVwGxwvjdI8feG8uT6kD11HQ4UN9lZUKqev2k_VNZGilJCZWZluVP6FHNO1L7tI8C9YnR7jO4Vo9thnKe-_X_K28w_buVfBxecDg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>872132057</pqid></control><display><type>article</type><title>The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data</title><source>PubMed Central (Open Access)</source><source>Oxford Journals Online</source><creator>Wei, Wei ; Visweswaran, Shyam ; Cooper, Gregory F</creator><creatorcontrib>Wei, Wei ; Visweswaran, Shyam ; Cooper, Gregory F</creatorcontrib><description>Predicting patient outcomes from genome-wide measurements holds significant promise for improving clinical care. The large number of measurements (eg, single nucleotide polymorphisms (SNPs)), however, makes this task computationally challenging. This paper evaluates the performance of an algorithm that predicts patient outcomes from genome-wide data by efficiently model averaging over an exponential number of naive Bayes (NB) models. 
This model-averaged naive Bayes (MANB) method was applied to predict late onset Alzheimer's disease in 1411 individuals who each had 312,318 SNP measurements available as genome-wide predictive features. Its performance was compared to that of a naive Bayes algorithm without feature selection (NB) and with feature selection (FSNB). Performance of each algorithm was measured in terms of area under the ROC curve (AUC), calibration, and run time. The training time of MANB (16.1 s) was fast like NB (15.6 s), while FSNB (1684.2 s) was considerably slower. Each of the three algorithms required less than 0.1 s to predict the outcome of a test case. MANB had an AUC of 0.72, which is significantly better than the AUC of 0.59 by NB (p&lt;0.00001), but not significantly different from the AUC of 0.71 by FSNB. MANB was better calibrated than NB, and FSNB was even better in calibration. A limitation was that only one dataset and two comparison algorithms were included in this study. MANB performed comparatively well in predicting a clinical outcome from a high-dimensional genome-wide dataset. 
These results provide support for including MANB in the methods used to predict outcomes from large, genome-wide datasets.</description><identifier>ISSN: 1067-5027</identifier><identifier>EISSN: 1527-974X</identifier><identifier>DOI: 10.1136/amiajnl-2011-000101</identifier><identifier>PMID: 21672907</identifier><language>eng</language><publisher>England: BMJ Group</publisher><subject>Aged ; Aged, 80 and over ; Algorithms ; Alzheimer Disease - diagnosis ; Alzheimer Disease - genetics ; Apolipoproteins E - genetics ; Artificial Intelligence ; Bayes Theorem ; Case-Control Studies ; Genome-Wide Association Study ; Humans ; Models, Genetic ; Polymorphism, Single Nucleotide ; Prognosis ; Research and Applications ; ROC Curve</subject><ispartof>Journal of the American Medical Informatics Association : JAMIA, 2011-07, Vol.18 (4), p.370-375</ispartof><rights>2011, Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions. 
2011</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c486t-4ef80c9291a329824eb66e92151eae2ce9545fc0c7921bd9bec9e60c5e610043</citedby><cites>FETCH-LOGICAL-c486t-4ef80c9291a329824eb66e92151eae2ce9545fc0c7921bd9bec9e60c5e610043</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3128400/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3128400/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/21672907$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wei, Wei</creatorcontrib><creatorcontrib>Visweswaran, Shyam</creatorcontrib><creatorcontrib>Cooper, Gregory F</creatorcontrib><title>The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data</title><title>Journal of the American Medical Informatics Association : JAMIA</title><addtitle>J Am Med Inform Assoc</addtitle><description>Predicting patient outcomes from genome-wide measurements holds significant promise for improving clinical care. The large number of measurements (eg, single nucleotide polymorphisms (SNPs)), however, makes this task computationally challenging. This paper evaluates the performance of an algorithm that predicts patient outcomes from genome-wide data by efficiently model averaging over an exponential number of naive Bayes (NB) models. This model-averaged naive Bayes (MANB) method was applied to predict late onset Alzheimer's disease in 1411 individuals who each had 312,318 SNP measurements available as genome-wide predictive features. 
Its performance was compared to that of a naive Bayes algorithm without feature selection (NB) and with feature selection (FSNB). Performance of each algorithm was measured in terms of area under the ROC curve (AUC), calibration, and run time. The training time of MANB (16.1 s) was fast like NB (15.6 s), while FSNB (1684.2 s) was considerably slower. Each of the three algorithms required less than 0.1 s to predict the outcome of a test case. MANB had an AUC of 0.72, which is significantly better than the AUC of 0.59 by NB (p&lt;0.00001), but not significantly different from the AUC of 0.71 by FSNB. MANB was better calibrated than NB, and FSNB was even better in calibration. A limitation was that only one dataset and two comparison algorithms were included in this study. MANB performed comparatively well in predicting a clinical outcome from a high-dimensional genome-wide dataset. These results provide support for including MANB in the methods used to predict outcomes from large, genome-wide datasets.</description><subject>Aged</subject><subject>Aged, 80 and over</subject><subject>Algorithms</subject><subject>Alzheimer Disease - diagnosis</subject><subject>Alzheimer Disease - genetics</subject><subject>Apolipoproteins E - genetics</subject><subject>Artificial Intelligence</subject><subject>Bayes Theorem</subject><subject>Case-Control Studies</subject><subject>Genome-Wide Association Study</subject><subject>Humans</subject><subject>Models, Genetic</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Prognosis</subject><subject>Research and Applications</subject><subject>ROC 
Curve</subject><issn>1067-5027</issn><issn>1527-974X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><recordid>eNqFkU1r3DAQhkVpyfcvKBTdcnIykmXJugTSkKSFQC97yE3MyuNdBdtyJe-W5NfXYbehOeU0w8w7LzPzMPZVwIUQpb7EPuDT0BUShCgAQID4xI5EJU1hjXr8POegTVGBNIfsOOenWaJlWR2wQym0kRbMEfOLNXEcxy54nEIceGz5gGFL_Ds-U-Z9bKjjuKWEqzCs-BT5mKgJfuLX3cuaQk_pPPMmZMJMvE2x5ysaYk_Fn9AQb3DCU_alxS7T2T6esMXd7eLmR_Hw6_7nzfVD4VWtp0JRW4O30gospa2loqXWZKWoBCFJT7ZSVevBm7m2bOySvCUNviItAFR5wq52tuNm2VPjaZgSdm5Mocf07CIG974zhLVbxa0rhawVwGxwvjdI8feG8uT6kD11HQ4UN9lZUKqev2k_VNZGilJCZWZluVP6FHNO1L7tI8C9YnR7jO4Vo9thnKe-_X_K28w_buVfBxecDg</recordid><startdate>20110701</startdate><enddate>20110701</enddate><creator>Wei, Wei</creator><creator>Visweswaran, Shyam</creator><creator>Cooper, Gregory F</creator><general>BMJ Group</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>5PM</scope></search><sort><creationdate>20110701</creationdate><title>The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data</title><author>Wei, Wei ; Visweswaran, Shyam ; Cooper, Gregory F</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c486t-4ef80c9291a329824eb66e92151eae2ce9545fc0c7921bd9bec9e60c5e610043</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Aged</topic><topic>Aged, 80 and over</topic><topic>Algorithms</topic><topic>Alzheimer Disease - diagnosis</topic><topic>Alzheimer Disease - genetics</topic><topic>Apolipoproteins E - genetics</topic><topic>Artificial Intelligence</topic><topic>Bayes Theorem</topic><topic>Case-Control Studies</topic><topic>Genome-Wide Association 
Study</topic><topic>Humans</topic><topic>Models, Genetic</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Prognosis</topic><topic>Research and Applications</topic><topic>ROC Curve</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wei, Wei</creatorcontrib><creatorcontrib>Visweswaran, Shyam</creatorcontrib><creatorcontrib>Cooper, Gregory F</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of the American Medical Informatics Association : JAMIA</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wei, Wei</au><au>Visweswaran, Shyam</au><au>Cooper, Gregory F</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data</atitle><jtitle>Journal of the American Medical Informatics Association : JAMIA</jtitle><addtitle>J Am Med Inform Assoc</addtitle><date>2011-07-01</date><risdate>2011</risdate><volume>18</volume><issue>4</issue><spage>370</spage><epage>375</epage><pages>370-375</pages><issn>1067-5027</issn><eissn>1527-974X</eissn><abstract>Predicting patient outcomes from genome-wide measurements holds significant promise for improving clinical care. The large number of measurements (eg, single nucleotide polymorphisms (SNPs)), however, makes this task computationally challenging. 
This paper evaluates the performance of an algorithm that predicts patient outcomes from genome-wide data by efficiently model averaging over an exponential number of naive Bayes (NB) models. This model-averaged naive Bayes (MANB) method was applied to predict late onset Alzheimer's disease in 1411 individuals who each had 312,318 SNP measurements available as genome-wide predictive features. Its performance was compared to that of a naive Bayes algorithm without feature selection (NB) and with feature selection (FSNB). Performance of each algorithm was measured in terms of area under the ROC curve (AUC), calibration, and run time. The training time of MANB (16.1 s) was fast like NB (15.6 s), while FSNB (1684.2 s) was considerably slower. Each of the three algorithms required less than 0.1 s to predict the outcome of a test case. MANB had an AUC of 0.72, which is significantly better than the AUC of 0.59 by NB (p&lt;0.00001), but not significantly different from the AUC of 0.71 by FSNB. MANB was better calibrated than NB, and FSNB was even better in calibration. A limitation was that only one dataset and two comparison algorithms were included in this study. MANB performed comparatively well in predicting a clinical outcome from a high-dimensional genome-wide dataset. These results provide support for including MANB in the methods used to predict outcomes from large, genome-wide datasets.</abstract><cop>England</cop><pub>BMJ Group</pub><pmid>21672907</pmid><doi>10.1136/amiajnl-2011-000101</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1067-5027
ispartof Journal of the American Medical Informatics Association : JAMIA, 2011-07, Vol.18 (4), p.370-375
issn 1067-5027
1527-974X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3128400
source PubMed Central (Open Access); Oxford Journals Online
subjects Aged
Aged, 80 and over
Algorithms
Alzheimer Disease - diagnosis
Alzheimer Disease - genetics
Apolipoproteins E - genetics
Artificial Intelligence
Bayes Theorem
Case-Control Studies
Genome-Wide Association Study
Humans
Models, Genetic
Polymorphism, Single Nucleotide
Prognosis
Research and Applications
ROC Curve
title The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T06%3A20%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20application%20of%20naive%20Bayes%20model%20averaging%20to%20predict%20Alzheimer's%20disease%20from%20genome-wide%20data&rft.jtitle=Journal%20of%20the%20American%20Medical%20Informatics%20Association%20:%20JAMIA&rft.au=Wei,%20Wei&rft.date=2011-07-01&rft.volume=18&rft.issue=4&rft.spage=370&rft.epage=375&rft.pages=370-375&rft.issn=1067-5027&rft.eissn=1527-974X&rft_id=info:doi/10.1136/amiajnl-2011-000101&rft_dat=%3Cproquest_pubme%3E872132057%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c486t-4ef80c9291a329824eb66e92151eae2ce9545fc0c7921bd9bec9e60c5e610043%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=872132057&rft_id=info:pmid/21672907&rfr_iscdi=true