An efficient resampling method for calibrating single and gene-based rare variant association analysis in case-control studies
For aggregation tests of genes or regions, the set of included variants often has small total minor allele counts (MACs), and this is particularly true when the most deleterious sets of variants are considered. When MAC is low, commonly used asymptotic tests are not well calibrated for binary pheno...
Published in: | Biostatistics (Oxford, England), 2016-01, Vol.17 (1), p.1-15 |
---|---|
Main Authors: | Lee, Seunggeun ; Fuchsberger, Christian ; Kim, Sehee ; Scott, Laura |
Format: | Article |
Language: | English |
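Subjects: | Calibration ; Efficiency ; Genes ; Genetic Association Studies - methods ; Genetic Variation ; Genotype & phenotype ; Humans ; Models, Statistical ; Sampling ; Sequence Analysis, DNA - methods |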
cited_by | cdi_FETCH-LOGICAL-c502t-f0e36c0a9c6b3e2665e9d7103b53a746c9946912a29fa79af3de64f86331c1a33 |
---|---|
cites | cdi_FETCH-LOGICAL-c502t-f0e36c0a9c6b3e2665e9d7103b53a746c9946912a29fa79af3de64f86331c1a33 |
container_end_page | 15 |
container_issue | 1 |
container_start_page | 1 |
container_title | Biostatistics (Oxford, England) |
container_volume | 17 |
creator | Lee, Seunggeun ; Fuchsberger, Christian ; Kim, Sehee ; Scott, Laura |
description | For aggregation tests of genes or regions, the set of included variants often has small total minor allele counts (MACs), and this is particularly true when the most deleterious sets of variants are considered. When MAC is low, commonly used asymptotic tests are not well calibrated for binary phenotypes and can have conservative or anti-conservative results and potential power loss. Empirical p-values obtained via resampling methods are computationally costly for highly significant p-values, and the results can be conservative due to the discrete nature of resampling tests. Based on the observation that only the individuals containing minor alleles contribute to the score statistics, we develop an efficient resampling method for single and multiple variant score-based tests that can adjust for covariates. Our method can improve computational efficiency >1000-fold over conventional resampling for low MAC variant sets. We ameliorate the conservativeness of results through the use of mid-p-values. Using the estimated minimum achievable p-value for each test, we calibrate QQ plots and provide an effective number of tests. In analysis of a case-control study with deep exome sequence, we demonstrate that our methods are both well calibrated and also reduce computation time significantly compared with resampling methods. |
doi_str_mv | 10.1093/biostatistics/kxv033 |
format | article |
pmid | 26363037 |
publisher | Oxford Publishing Limited (England) |
rights | The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. |
fulltext | fulltext |
identifier | ISSN: 1465-4644 |
ispartof | Biostatistics (Oxford, England), 2016-01, Vol.17 (1), p.1-15 |
issn | 1465-4644 1468-4357 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4692986 |
source | Oxford Journals Online |
subjects | Calibration ; Efficiency ; Genes ; Genetic Association Studies - methods ; Genetic Variation ; Genotype & phenotype ; Humans ; Models, Statistical ; Sampling ; Sequence Analysis, DNA - methods |
title | An efficient resampling method for calibrating single and gene-based rare variant association analysis in case-control studies |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T20%3A03%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20efficient%20resampling%20method%20for%20calibrating%20single%20and%20gene-based%20rare%20variant%20association%20analysis%20in%20case-control%20studies&rft.jtitle=Biostatistics%20(Oxford,%20England)&rft.au=Lee,%20Seunggeun&rft.date=2016-01-01&rft.volume=17&rft.issue=1&rft.spage=1&rft.epage=15&rft.pages=1-15&rft.issn=1465-4644&rft.eissn=1468-4357&rft_id=info:doi/10.1093/biostatistics/kxv033&rft_dat=%3Cproquest_pubme%3E1750001966%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c502t-f0e36c0a9c6b3e2665e9d7103b53a746c9946912a29fa79af3de64f86331c1a33%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1753465175&rft_id=info:pmid/26363037&rfr_iscdi=true |
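
The description above mentions mid-p-values and a minimum achievable p-value for discrete tests. The following is a minimal sketch of those two quantities only, not the authors' resampling method: it assumes a single variant, no covariates, one minor allele per carrier, and a one-sided exact (hypergeometric) test on how many carriers are cases. The function names and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, n_total, n_cases, mac):
    """P(exactly k of the `mac` carriers are cases) when labels are random."""
    return comb(n_cases, k) * comb(n_total - n_cases, mac - k) / comb(n_total, mac)


def one_sided_p_values(k_obs, n_total, n_cases, mac):
    """Return (conventional p, mid-p, minimum achievable p) for the upper tail.

    Assumption: each of the `mac` minor alleles sits in a different individual,
    so the statistic is simply the number of carriers who are cases.
    """
    lo = max(0, mac - (n_total - n_cases))  # fewest carriers that can be cases
    hi = min(mac, n_cases)                  # most carriers that can be cases
    pmf = {k: hypergeom_pmf(k, n_total, n_cases, mac) for k in range(lo, hi + 1)}

    p_greater = sum(p for k, p in pmf.items() if k > k_obs)
    p_equal = pmf[k_obs]

    conventional = p_greater + p_equal      # P(T >= t_obs)
    mid_p = p_greater + 0.5 * p_equal       # P(T > t_obs) + 0.5 * P(T = t_obs)
    min_achievable = pmf[hi]                # p-value of the most extreme outcome
    return conventional, mid_p, min_achievable


if __name__ == "__main__":
    # Toy example: 10 individuals, 5 cases, 2 carriers, both carriers are cases.
    conv, mid, p_min = one_sided_p_values(k_obs=2, n_total=10, n_cases=5, mac=2)
    print(conv, mid, p_min)
```

With so few minor alleles the conventional p-value cannot drop below the probability of the most extreme outcome, which is why the sketch reports that floor alongside the mid-p-value; the mid-p-value halves the weight of the observed outcome and is therefore never larger than the conventional one.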