
An efficient resampling method for calibrating single and gene-based rare variant association analysis in case-control studies

For aggregation tests of genes or regions, the set of included variants often has a small total minor allele count (MAC), and this is particularly true when the most deleterious sets of variants are considered. When MAC is low, commonly used asymptotic tests are not well calibrated for binary pheno...

Bibliographic Details
Published in: Biostatistics (Oxford, England), 2016-01, Vol.17 (1), p.1-15
Main Authors: Lee, Seunggeun; Fuchsberger, Christian; Kim, Sehee; Scott, Laura
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c502t-f0e36c0a9c6b3e2665e9d7103b53a746c9946912a29fa79af3de64f86331c1a33
cites cdi_FETCH-LOGICAL-c502t-f0e36c0a9c6b3e2665e9d7103b53a746c9946912a29fa79af3de64f86331c1a33
container_end_page 15
container_issue 1
container_start_page 1
container_title Biostatistics (Oxford, England)
container_volume 17
creator Lee, Seunggeun
Fuchsberger, Christian
Kim, Sehee
Scott, Laura
description For aggregation tests of genes or regions, the set of included variants often has a small total minor allele count (MAC), and this is particularly true when the most deleterious sets of variants are considered. When MAC is low, commonly used asymptotic tests are not well calibrated for binary phenotypes and can have conservative or anti-conservative results and potential power loss. Empirical p-values obtained via resampling methods are computationally costly for highly significant p-values, and the results can be conservative due to the discrete nature of resampling tests. Based on the observation that only the individuals containing minor alleles contribute to the score statistics, we develop an efficient resampling method for single and multiple variant score-based tests that can adjust for covariates. Our method can improve computational efficiency >1000-fold over conventional resampling for low MAC variant sets. We ameliorate the conservativeness of results through the use of mid-p-values. Using the estimated minimum achievable p-value for each test, we calibrate QQ plots and provide an effective number of tests. In analysis of a case-control study with deep exome sequence, we demonstrate that our methods are well calibrated and reduce computation time significantly compared with resampling methods.
doi_str_mv 10.1093/biostatistics/kxv033
format article
fulltext fulltext
identifier ISSN: 1465-4644
ispartof Biostatistics (Oxford, England), 2016-01, Vol.17 (1), p.1-15
issn 1465-4644
1468-4357
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4692986
source Oxford Journals Online
subjects Calibration
Efficiency
Genes
Genetic Association Studies - methods
Genetic Variation
Genotype & phenotype
Humans
Models, Statistical
Sampling
Sequence Analysis, DNA - methods
title An efficient resampling method for calibrating single and gene-based rare variant association analysis in case-control studies
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T20%3A03%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20efficient%20resampling%20method%20for%20calibrating%20single%20and%20gene-based%20rare%20variant%20association%20analysis%20in%20case-control%20studies&rft.jtitle=Biostatistics%20(Oxford,%20England)&rft.au=Lee,%20Seunggeun&rft.date=2016-01-01&rft.volume=17&rft.issue=1&rft.spage=1&rft.epage=15&rft.pages=1-15&rft.issn=1465-4644&rft.eissn=1468-4357&rft_id=info:doi/10.1093/biostatistics/kxv033&rft_dat=%3Cproquest_pubme%3E1750001966%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c502t-f0e36c0a9c6b3e2665e9d7103b53a746c9946912a29fa79af3de64f86331c1a33%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1753465175&rft_id=info:pmid/26363037&rfr_iscdi=true