Variational Bayes for high-dimensional proportional hazards models with applications within gene expression
Abstract. Motivation: Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify...
Published in: | Bioinformatics 2022-08, Vol.38 (16), p.3918-3926 |
---|---|
Main Authors: | Komodromos, Michael; Aboagye, Eric O; Evangelou, Marina; Filippi, Sarah; Ray, Kolyan |
Format: | Article |
Language: | English |
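Subjects: | Bayes Theorem; Gene Expression; Humans; Markov Chains; Monte Carlo Method; Original Papers; Proportional Hazards Models |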
cited_by | cdi_FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3 |
---|---|
cites | cdi_FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3 |
container_end_page | 3926 |
container_issue | 16 |
container_start_page | 3918 |
container_title | Bioinformatics |
container_volume | 38 |
creator | Komodromos, Michael; Aboagye, Eric O; Evangelou, Marina; Filippi, Sarah; Ray, Kolyan |
description | Abstract
Motivation
Few Bayesian methods for analyzing high-dimensional sparse survival data provide scalable variable selection, effect estimation and uncertainty quantification. Such methods often either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense.
Results
We bridge this gap and develop an interpretable and scalable Bayesian proportional hazards model for prediction and variable selection, referred to as sparse variational Bayes. Our method, based on a mean-field variational approximation, overcomes the high computational cost of Markov chain Monte Carlo, whilst retaining useful features, providing a posterior distribution for the parameters and offering a natural mechanism for variable selection via posterior inclusion probabilities. The performance of our proposed method is assessed via extensive simulations and compared against other state-of-the-art Bayesian variable selection methods, demonstrating comparable or better performance. Finally, we demonstrate how the proposed method can be used for variable selection on two transcriptomic datasets with censored survival outcomes, and how the uncertainty quantification offered by our method can be used to provide an interpretable assessment of patient risk.
Availability and implementation
Our method has been implemented as a freely available R package, survival.svb (https://github.com/mkomod/survival.svb).
Supplementary information
Supplementary data are available at Bioinformatics online. |
doi_str_mv | 10.1093/bioinformatics/btac416 |
format | article |
contributor | Kelso, Janet |
publisher | England: Oxford University Press |
date | 2022-08-10 |
pmid | 35751586 |
orcid | https://orcid.org/0000-0001-8662-0953 |
rights | The Author(s) 2022. Published by Oxford University Press. |
toplevel | peer_reviewed; online_resources |
oa | free_for_read |
tpages | 9 |
linktohtml | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9364383/ |
linktopdf | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9364383/pdf/ |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 2022-08, Vol.38 (16), p.3918-3926 |
issn | 1367-4803; 1460-2059; 1367-4811 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9364383 |
source | Oxford Open; PubMed Central |
subjects | Bayes Theorem; Gene Expression; Humans; Markov Chains; Monte Carlo Method; Original Papers; Proportional Hazards Models |
title | Variational Bayes for high-dimensional proportional hazards models with applications within gene expression |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T23%3A28%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Variational%20Bayes%20for%20high-dimensional%20proportional%20hazards%20models%20with%20applications%20within%20gene%20expression&rft.jtitle=Bioinformatics&rft.au=Komodromos,%20Michael&rft.date=2022-08-10&rft.volume=38&rft.issue=16&rft.spage=3918&rft.epage=3926&rft.pages=3918-3926&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/btac416&rft_dat=%3Cproquest_pubme%3E2681042782%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c456t-87e1aae111e8029997ba775b24109940c49065118a0f16066df5ad10de8bec1f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2681042782&rft_id=info:pmid/35751586&rft_oup_id=10.1093/bioinformatics/btac416&rfr_iscdi=true |