Deep Bayesian Gaussian processes for uncertainty estimation in electronic health records

One major impediment to the wider use of deep learning for clinical decision making is the difficulty of assigning a level of confidence to model predictions. Currently, deep Bayesian neural networks and sparse Gaussian processes are the two main scalable uncertainty estimation methods. However, dee...

Bibliographic Details
Published in: Scientific reports 2021-10, Vol.11 (1), p.20685-20685, Article 20685
Main Authors: Li, Yikuan, Rao, Shishir, Hassaine, Abdelaali, Ramakrishnan, Rema, Canoy, Dexter, Salimi-Khorshidi, Gholamreza, Mamouei, Mohammad, Lukasiewicz, Thomas, Rahimi, Kazem
Format: Article
Language: English
container_end_page 20685
container_issue 1
container_start_page 20685
container_title Scientific reports
container_volume 11
creator Li, Yikuan
Rao, Shishir
Hassaine, Abdelaali
Ramakrishnan, Rema
Canoy, Dexter
Salimi-Khorshidi, Gholamreza
Mamouei, Mohammad
Lukasiewicz, Thomas
Rahimi, Kazem
description One major impediment to the wider use of deep learning for clinical decision making is the difficulty of assigning a level of confidence to model predictions. Currently, deep Bayesian neural networks and sparse Gaussian processes are the two main scalable uncertainty estimation methods. However, deep Bayesian neural networks suffer from a lack of expressiveness, and more expressive models such as deep kernel learning, which is an extension of sparse Gaussian processes, capture only the uncertainty from the higher-level latent space. Therefore, the underlying deep learning model lacks interpretability and ignores uncertainty from the raw data. In this paper, we merge features of the deep Bayesian learning framework with deep kernel learning to leverage the strengths of both methods for a more comprehensive uncertainty estimation. Through a series of experiments on predicting the first incidence of heart failure, diabetes and depression using large-scale electronic medical records, we demonstrate that our method is better at capturing uncertainty than both Gaussian processes and deep Bayesian neural networks in terms of indicating data insufficiency and identifying misclassifications, with comparable generalization performance. Furthermore, by assessing the accuracy and area under the receiver operating characteristic curve over the predictive probability, we show that our method is less susceptible to making overconfident predictions, especially for the minority class in imbalanced datasets. Finally, we demonstrate how uncertainty information derived by the model can inform risk factor analysis towards model interpretability.
doi_str_mv 10.1038/s41598-021-00144-6
format article
fullrecord Deep Bayesian Gaussian processes for uncertainty estimation in electronic health records. Scientific reports (Sci Rep), 2021-10-19, Vol.11 (1), p.20685-20685, Article 20685. London: Nature Publishing Group UK. ISSN: 2045-2322; EISSN: 2045-2322; DOI: 10.1038/s41598-021-00144-6; PMID: 34667200. Peer reviewed; free to read. Rights: The Author(s) 2021, corrected publication 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License").
fulltext fulltext
identifier ISSN: 2045-2322
ispartof Scientific reports, 2021-10, Vol.11 (1), p.20685-20685, Article 20685
issn 2045-2322
2045-2322
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_bc79d9bbee154a139edf7d64fff402f4
source PubMed Central(OpenAccess); ProQuest - Publicly Available Content Database; Free Full-Text Journals in Chemistry; Springer Nature - nature.com Journals - Fully Open Access
subjects 639/705/117
692/699/75
Bayesian analysis
Congestive heart failure
Decision making
Deep learning
Diabetes mellitus
Electronic medical records
Factor analysis
Humanities and Social Sciences
Mathematical models
multidisciplinary
Neural networks
Risk factors
Science
Science (multidisciplinary)
title Deep Bayesian Gaussian processes for uncertainty estimation in electronic health records
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T12%3A11%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Bayesian%20Gaussian%20processes%20for%20uncertainty%20estimation%20in%20electronic%20health%20records&rft.jtitle=Scientific%20reports&rft.au=Li,%20Yikuan&rft.date=2021-10-19&rft.volume=11&rft.issue=1&rft.spage=20685&rft.epage=20685&rft.pages=20685-20685&rft.artnum=20685&rft.issn=2045-2322&rft.eissn=2045-2322&rft_id=info:doi/10.1038/s41598-021-00144-6&rft_dat=%3Cproquest_doaj_%3E2584013281%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c540t-766bf75b5e02481476e21ed8e4fff015e8dd1dbbc5adab45f8d70d2d161906643%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2583229290&rft_id=info:pmid/34667200&rfr_iscdi=true