
Variational Bayes for high-dimensional proportional hazards models with applications within gene expression

Abstract. Motivation: Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify...

Bibliographic Details
Published in: Bioinformatics 2022-08, Vol.38 (16), p.3918-3926
Main Authors: Komodromos, Michael, Aboagye, Eric O, Evangelou, Marina, Filippi, Sarah, Ray, Kolyan
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3
cites cdi_FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3
container_end_page 3926
container_issue 16
container_start_page 3918
container_title Bioinformatics
container_volume 38
creator Komodromos, Michael
Aboagye, Eric O
Evangelou, Marina
Filippi, Sarah
Ray, Kolyan
description Abstract. Motivation: Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense. Results: We bridge this gap and develop an interpretable and scalable Bayesian proportional hazards model for prediction and variable selection, referred to as sparse variational Bayes. Our method, based on a mean-field variational approximation, overcomes the high computational cost of Markov chain Monte Carlo, whilst retaining useful features, providing a posterior distribution for the parameters and offering a natural mechanism for variable selection via posterior inclusion probabilities. The performance of our proposed method is assessed via extensive simulations and compared against other state-of-the-art Bayesian variable selection methods, demonstrating comparable or better performance. Finally, we demonstrate how the proposed method can be used for variable selection on two transcriptomic datasets with censored survival outcomes, and how the uncertainty quantification offered by our method can be used to provide an interpretable assessment of patient risk. Availability and implementation: Our method has been implemented as a freely available R package, survival.svb (https://github.com/mkomod/survival.svb). Supplementary information: Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/btac416
format article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9364383</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btac416</oup_id><sourcerecordid>2681042782</sourcerecordid><originalsourceid>FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3</originalsourceid><addsrcrecordid>eNqNkUtvnTAQRq2oUR63-QtXLLsh8YBtzKZSG-VRKVI3abfWYIaLU8DU5ub160PKbdTsuvLIc-bYo4-xNfBT4GV-VjnvhsaHHidn41k1oRWg9tgRCMXTjMvyw1znqkiF5vkhO47xjnMJQogDdpjLQoLU6oj9-onBzQ4_YJd8xSeKyWxNWrdp09r1NMSlNQY_-rDjWnzGUMek9zV1MXlwU5vgOHbO_jEtN25INjRQQo9joPiq-cj2G-winezOFftxeXF7fp3efL_6dv7lJrVCqinVBQEiAQBpnpVlWVRYFLLKxLx5KbgVJVcSQCNvQHGl6kZiDbwmXZGFJl-xz4t33FY91ZaGKWBnxuB6DE_GozPvO4NrzcbfmzJXItf5LPi0EwT_e0txMr2LlroOB_LbaDKlgYus0NmMqgW1wccYqHl7Brh5Tcq8T8rskpoH1_9-8m3sbzQzAAvgt-P_Sl8AFNaq2w</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2681042782</pqid></control><display><type>article</type><title>Variational Bayes for high-dimensional proportional hazards models with applications within gene expression</title><source>Oxford Open</source><source>PubMed Central</source><creator>Komodromos, Michael ; Aboagye, Eric O ; Evangelou, Marina ; Filippi, Sarah ; Ray, Kolyan</creator><contributor>Kelso, Janet</contributor><creatorcontrib>Komodromos, Michael ; Aboagye, Eric O ; Evangelou, Marina ; Filippi, Sarah ; Ray, Kolyan ; Kelso, Janet</creatorcontrib><description>Abstract Motivation Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense. 
Results We bridge this gap and develop an interpretable and scalable Bayesian proportional hazards model for prediction and variable selection, referred to as sparse variational Bayes. Our method, based on a mean-field variational approximation, overcomes the high computational cost of Markov chain Monte Carlo, whilst retaining useful features, providing a posterior distribution for the parameters and offering a natural mechanism for variable selection via posterior inclusion probabilities. The performance of our proposed method is assessed via extensive simulations and compared against other state-of-the-art Bayesian variable selection methods, demonstrating comparable or better performance. Finally, we demonstrate how the proposed method can be used for variable selection on two transcriptomic datasets with censored survival outcomes, and how the uncertainty quantification offered by our method can be used to provide an interpretable assessment of patient risk. Availability and implementation our method has been implemented as a freely available R package survival.svb (https://github.com/mkomod/survival.svb). Supplementary information Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btac416</identifier><identifier>PMID: 35751586</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Bayes Theorem ; Gene Expression ; Humans ; Markov Chains ; Monte Carlo Method ; Original Papers ; Proportional Hazards Models</subject><ispartof>Bioinformatics, 2022-08, Vol.38 (16), p.3918-3926</ispartof><rights>The Author(s) 2022. Published by Oxford University Press. 2022</rights><rights>The Author(s) 2022. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3</citedby><cites>FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3</cites><orcidid>0000-0001-8662-0953</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9364383/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9364383/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,724,777,781,882,1599,27905,27906,53772,53774</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35751586$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Kelso, Janet</contributor><creatorcontrib>Komodromos, Michael</creatorcontrib><creatorcontrib>Aboagye, Eric O</creatorcontrib><creatorcontrib>Evangelou, Marina</creatorcontrib><creatorcontrib>Filippi, Sarah</creatorcontrib><creatorcontrib>Ray, Kolyan</creatorcontrib><title>Variational Bayes for high-dimensional proportional hazards models with applications within gene expression</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Abstract Motivation Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense. 
Results We bridge this gap and develop an interpretable and scalable Bayesian proportional hazards model for prediction and variable selection, referred to as sparse variational Bayes. Our method, based on a mean-field variational approximation, overcomes the high computational cost of Markov chain Monte Carlo, whilst retaining useful features, providing a posterior distribution for the parameters and offering a natural mechanism for variable selection via posterior inclusion probabilities. The performance of our proposed method is assessed via extensive simulations and compared against other state-of-the-art Bayesian variable selection methods, demonstrating comparable or better performance. Finally, we demonstrate how the proposed method can be used for variable selection on two transcriptomic datasets with censored survival outcomes, and how the uncertainty quantification offered by our method can be used to provide an interpretable assessment of patient risk. Availability and implementation our method has been implemented as a freely available R package survival.svb (https://github.com/mkomod/survival.svb). 
Supplementary information Supplementary data are available at Bioinformatics online.</description><subject>Bayes Theorem</subject><subject>Gene Expression</subject><subject>Humans</subject><subject>Markov Chains</subject><subject>Monte Carlo Method</subject><subject>Original Papers</subject><subject>Proportional Hazards Models</subject><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><recordid>eNqNkUtvnTAQRq2oUR63-QtXLLsh8YBtzKZSG-VRKVI3abfWYIaLU8DU5ub160PKbdTsuvLIc-bYo4-xNfBT4GV-VjnvhsaHHidn41k1oRWg9tgRCMXTjMvyw1znqkiF5vkhO47xjnMJQogDdpjLQoLU6oj9-onBzQ4_YJd8xSeKyWxNWrdp09r1NMSlNQY_-rDjWnzGUMek9zV1MXlwU5vgOHbO_jEtN25INjRQQo9joPiq-cj2G-winezOFftxeXF7fp3efL_6dv7lJrVCqinVBQEiAQBpnpVlWVRYFLLKxLx5KbgVJVcSQCNvQHGl6kZiDbwmXZGFJl-xz4t33FY91ZaGKWBnxuB6DE_GozPvO4NrzcbfmzJXItf5LPi0EwT_e0txMr2LlroOB_LbaDKlgYus0NmMqgW1wccYqHl7Brh5Tcq8T8rskpoH1_9-8m3sbzQzAAvgt-P_Sl8AFNaq2w</recordid><startdate>20220810</startdate><enddate>20220810</enddate><creator>Komodromos, Michael</creator><creator>Aboagye, Eric O</creator><creator>Evangelou, Marina</creator><creator>Filippi, Sarah</creator><creator>Ray, Kolyan</creator><general>Oxford University Press</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-8662-0953</orcidid></search><sort><creationdate>20220810</creationdate><title>Variational Bayes for high-dimensional proportional hazards models with applications within gene expression</title><author>Komodromos, Michael ; Aboagye, Eric O ; Evangelou, Marina ; Filippi, Sarah ; Ray, 
Kolyan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Bayes Theorem</topic><topic>Gene Expression</topic><topic>Humans</topic><topic>Markov Chains</topic><topic>Monte Carlo Method</topic><topic>Original Papers</topic><topic>Proportional Hazards Models</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Komodromos, Michael</creatorcontrib><creatorcontrib>Aboagye, Eric O</creatorcontrib><creatorcontrib>Evangelou, Marina</creatorcontrib><creatorcontrib>Filippi, Sarah</creatorcontrib><creatorcontrib>Ray, Kolyan</creatorcontrib><collection>Oxford Open</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Komodromos, Michael</au><au>Aboagye, Eric O</au><au>Evangelou, Marina</au><au>Filippi, Sarah</au><au>Ray, Kolyan</au><au>Kelso, Janet</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Variational Bayes for high-dimensional proportional hazards models with applications within gene expression</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2022-08-10</date><risdate>2022</risdate><volume>38</volume><issue>16</issue><spage>3918</spage><epage>3926</epage><pages>3918-3926</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><abstract>Abstract Motivation Few Bayesian methods 
for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense. Results We bridge this gap and develop an interpretable and scalable Bayesian proportional hazards model for prediction and variable selection, referred to as sparse variational Bayes. Our method, based on a mean-field variational approximation, overcomes the high computational cost of Markov chain Monte Carlo, whilst retaining useful features, providing a posterior distribution for the parameters and offering a natural mechanism for variable selection via posterior inclusion probabilities. The performance of our proposed method is assessed via extensive simulations and compared against other state-of-the-art Bayesian variable selection methods, demonstrating comparable or better performance. Finally, we demonstrate how the proposed method can be used for variable selection on two transcriptomic datasets with censored survival outcomes, and how the uncertainty quantification offered by our method can be used to provide an interpretable assessment of patient risk. Availability and implementation our method has been implemented as a freely available R package survival.svb (https://github.com/mkomod/survival.svb). Supplementary information Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>35751586</pmid><doi>10.1093/bioinformatics/btac416</doi><tpages>9</tpages><orcidid>https://orcid.org/0000-0001-8662-0953</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2022-08, Vol.38 (16), p.3918-3926
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9364383
source Oxford Open; PubMed Central
subjects Bayes Theorem
Gene Expression
Humans
Markov Chains
Monte Carlo Method
Original Papers
Proportional Hazards Models
title Variational Bayes for high-dimensional proportional hazards models with applications within gene expression
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T23%3A28%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Variational%20Bayes%20for%20high-dimensional%20proportional%20hazards%20models%20with%20applications%20within%20gene%20expression&rft.jtitle=Bioinformatics&rft.au=Komodromos,%20Michael&rft.date=2022-08-10&rft.volume=38&rft.issue=16&rft.spage=3918&rft.epage=3926&rft.pages=3918-3926&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/btac416&rft_dat=%3Cproquest_pubme%3E2681042782%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2681042782&rft_id=info:pmid/35751586&rft_oup_id=10.1093/bioinformatics/btac416&rfr_iscdi=true