
A flexible and accurate genotype imputation method for the next generation of genome-wide association studies

Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000...

Bibliographic Details
Published in: PLoS genetics 2009-06, Vol.5 (6), p.e1000529-e1000529
Main Authors: Howie, Bryan N, Donnelly, Peter, Marchini, Jonathan
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c813t-bf1101ead3238d154fa58137d72bfbc65a7e317c9b495a79c7383254917e398c3
cites cdi_FETCH-LOGICAL-c813t-bf1101ead3238d154fa58137d72bfbc65a7e317c9b495a79c7383254917e398c3
container_end_page e1000529
container_issue 6
container_start_page e1000529
container_title PLoS genetics
container_volume 5
creator Howie, Bryan N
Donnelly, Peter
Marchini, Jonathan
description Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%-20% lower than those of the closest competing method. One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.
doi_str_mv 10.1371/journal.pgen.1000529
format article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_1313549676</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A203232070</galeid><doaj_id>oai_doaj_org_article_4d266fbe78fe41ba843a7640b8f9fa61</doaj_id><sourcerecordid>A203232070</sourcerecordid><originalsourceid>FETCH-LOGICAL-c813t-bf1101ead3238d154fa58137d72bfbc65a7e317c9b495a79c7383254917e398c3</originalsourceid><addsrcrecordid>eNqVk12L1DAUhoso7rr6D0QLwoIXMyZN26Q3wrD4MbC44NdtSJOTmSxpMyap7v57023VKXih5CLhnOd9E87JybKnGK0xofjVtRt8L-z6sIN-jRFCVdHcy05xVZEVLVF5_-h8kj0K4RohUrGGPsxOcFOVhFBymnWbXFu4Ma2FXPQqF1IOXkTIk6uLtwfITXcYoojG9XkHce9Urp3P4x7yHm7iyIGf0k7fqTpY_TAq2YXgpJlSIQ7KQHicPdDCBngy72fZl7dvPl-8X11evdtebC5XkmESV63GGGEQihSEKVyVWlQpQRUtWt3KuhIUCKayacsmnRtJCSNFVTY4xRsmyVn2fPI9WBf4XKnAMcEkUTWtE7GdCOXENT940wl_y50w_C7g_I4LH420wEtV1LVugTINJW4FK4mgdYlaphstapy8Xs-3DW0HSkIfvbAL02WmN3u-c995UbOmIeNjzmcD774NECLvTJBgrejBDYHXqYcMFyyBLyZwJ9LDTK9d8pMjzDcFStUqEEWJWv-FSktBZ6TrQZsUXwheLgSJiam3OzGEwLefPv4H--Hf2auvS_b8iN2DsHEfnB3G7xOWYDmB0rsQPOjfhcaIj5Pxq998nAw-T0aSPTtu0h_RPArkJzleCSE</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>67408128</pqid></control><display><type>article</type><title>A flexible and accurate genotype imputation method for the next generation of genome-wide association studies</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Howie, Bryan N ; Donnelly, Peter ; Marchini, Jonathan</creator><contributor>Schork, Nicholas J.</contributor><creatorcontrib>Howie, Bryan N ; Donnelly, Peter ; Marchini, Jonathan ; Schork, Nicholas J.</creatorcontrib><description>Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. 
Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%-20% lower than those of the closest competing method. 
One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.</description><identifier>ISSN: 1553-7404</identifier><identifier>ISSN: 1553-7390</identifier><identifier>EISSN: 1553-7404</identifier><identifier>DOI: 10.1371/journal.pgen.1000529</identifier><identifier>PMID: 19543373</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Accuracy ; Genetic algorithms ; Genetics ; Genetics and Genomics/Bioinformatics ; Genetics and Genomics/Genomics ; Genetics, Population ; Genome-Wide Association Study - methods ; Genomes ; Genotype ; Genotype &amp; phenotype ; Haplotypes ; Humans ; Methods ; Multiple imputation (Statistics) ; Polymorphism ; Polymorphism, Single Nucleotide ; Single nucleotide polymorphisms ; Software ; Studies</subject><ispartof>PLoS genetics, 2009-06, Vol.5 (6), p.e1000529-e1000529</ispartof><rights>COPYRIGHT 2009 Public Library of Science</rights><rights>Howie et al. 2009</rights><rights>2009 Howie et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Howie BN, Donnelly P, Marchini J (2009) A Flexible and Accurate Genotype Imputation Method for the Next Generation of Genome-Wide Association Studies. PLoS Genet 5(6): e1000529. 
doi:10.1371/journal.pgen.1000529</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c813t-bf1101ead3238d154fa58137d72bfbc65a7e317c9b495a79c7383254917e398c3</citedby><cites>FETCH-LOGICAL-c813t-bf1101ead3238d154fa58137d72bfbc65a7e317c9b495a79c7383254917e398c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2689936/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2689936/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,37013,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19543373$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Schork, Nicholas J.</contributor><creatorcontrib>Howie, Bryan N</creatorcontrib><creatorcontrib>Donnelly, Peter</creatorcontrib><creatorcontrib>Marchini, Jonathan</creatorcontrib><title>A flexible and accurate genotype imputation method for the next generation of genome-wide association studies</title><title>PLoS genetics</title><addtitle>PLoS Genet</addtitle><description>Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. 
The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%-20% lower than those of the closest competing method. One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.</description><subject>Accuracy</subject><subject>Genetic algorithms</subject><subject>Genetics</subject><subject>Genetics and Genomics/Bioinformatics</subject><subject>Genetics and Genomics/Genomics</subject><subject>Genetics, Population</subject><subject>Genome-Wide Association Study - methods</subject><subject>Genomes</subject><subject>Genotype</subject><subject>Genotype &amp; phenotype</subject><subject>Haplotypes</subject><subject>Humans</subject><subject>Methods</subject><subject>Multiple imputation (Statistics)</subject><subject>Polymorphism</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Single nucleotide 
polymorphisms</subject><subject>Software</subject><subject>Studies</subject><issn>1553-7404</issn><issn>1553-7390</issn><issn>1553-7404</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNqVk12L1DAUhoso7rr6D0QLwoIXMyZN26Q3wrD4MbC44NdtSJOTmSxpMyap7v57023VKXih5CLhnOd9E87JybKnGK0xofjVtRt8L-z6sIN-jRFCVdHcy05xVZEVLVF5_-h8kj0K4RohUrGGPsxOcFOVhFBymnWbXFu4Ma2FXPQqF1IOXkTIk6uLtwfITXcYoojG9XkHce9Urp3P4x7yHm7iyIGf0k7fqTpY_TAq2YXgpJlSIQ7KQHicPdDCBngy72fZl7dvPl-8X11evdtebC5XkmESV63GGGEQihSEKVyVWlQpQRUtWt3KuhIUCKayacsmnRtJCSNFVTY4xRsmyVn2fPI9WBf4XKnAMcEkUTWtE7GdCOXENT940wl_y50w_C7g_I4LH420wEtV1LVugTINJW4FK4mgdYlaphstapy8Xs-3DW0HSkIfvbAL02WmN3u-c995UbOmIeNjzmcD774NECLvTJBgrejBDYHXqYcMFyyBLyZwJ9LDTK9d8pMjzDcFStUqEEWJWv-FSktBZ6TrQZsUXwheLgSJiam3OzGEwLefPv4H--Hf2auvS_b8iN2DsHEfnB3G7xOWYDmB0rsQPOjfhcaIj5Pxq998nAw-T0aSPTtu0h_RPArkJzleCSE</recordid><startdate>20090601</startdate><enddate>20090601</enddate><creator>Howie, Bryan N</creator><creator>Donnelly, Peter</creator><creator>Marchini, Jonathan</creator><general>Public Library of Science</general><general>Public Library of Science (PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IOV</scope><scope>ISN</scope><scope>ISR</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20090601</creationdate><title>A flexible and accurate genotype imputation method for the next generation of genome-wide association studies</title><author>Howie, Bryan N ; Donnelly, Peter ; Marchini, 
Jonathan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c813t-bf1101ead3238d154fa58137d72bfbc65a7e317c9b495a79c7383254917e398c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Accuracy</topic><topic>Genetic algorithms</topic><topic>Genetics</topic><topic>Genetics and Genomics/Bioinformatics</topic><topic>Genetics and Genomics/Genomics</topic><topic>Genetics, Population</topic><topic>Genome-Wide Association Study - methods</topic><topic>Genomes</topic><topic>Genotype</topic><topic>Genotype &amp; phenotype</topic><topic>Haplotypes</topic><topic>Humans</topic><topic>Methods</topic><topic>Multiple imputation (Statistics)</topic><topic>Polymorphism</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Single nucleotide polymorphisms</topic><topic>Software</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Howie, Bryan N</creatorcontrib><creatorcontrib>Donnelly, Peter</creatorcontrib><creatorcontrib>Marchini, Jonathan</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale_Opposing Viewpoints In Context</collection><collection>Gale In Context: Canada</collection><collection>Gale In Context: Science</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Howie, Bryan N</au><au>Donnelly, Peter</au><au>Marchini, Jonathan</au><au>Schork, Nicholas 
J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A flexible and accurate genotype imputation method for the next generation of genome-wide association studies</atitle><jtitle>PLoS genetics</jtitle><addtitle>PLoS Genet</addtitle><date>2009-06-01</date><risdate>2009</risdate><volume>5</volume><issue>6</issue><spage>e1000529</spage><epage>e1000529</epage><pages>e1000529-e1000529</pages><issn>1553-7404</issn><issn>1553-7390</issn><eissn>1553-7404</eissn><abstract>Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%-20% lower than those of the closest competing method. 
One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>19543373</pmid><doi>10.1371/journal.pgen.1000529</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7404
ispartof PLoS genetics, 2009-06, Vol.5 (6), p.e1000529-e1000529
issn 1553-7404
1553-7390
1553-7404
language eng
recordid cdi_plos_journals_1313549676
source Publicly Available Content Database; PubMed Central
subjects Accuracy
Genetic algorithms
Genetics
Genetics and Genomics/Bioinformatics
Genetics and Genomics/Genomics
Genetics, Population
Genome-Wide Association Study - methods
Genomes
Genotype
Genotype & phenotype
Haplotypes
Humans
Methods
Multiple imputation (Statistics)
Polymorphism
Polymorphism, Single Nucleotide
Single nucleotide polymorphisms
Software
Studies
title A flexible and accurate genotype imputation method for the next generation of genome-wide association studies
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T10%3A34%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20flexible%20and%20accurate%20genotype%20imputation%20method%20for%20the%20next%20generation%20of%20genome-wide%20association%20studies&rft.jtitle=PLoS%20genetics&rft.au=Howie,%20Bryan%20N&rft.date=2009-06-01&rft.volume=5&rft.issue=6&rft.spage=e1000529&rft.epage=e1000529&rft.pages=e1000529-e1000529&rft.issn=1553-7404&rft.eissn=1553-7404&rft_id=info:doi/10.1371/journal.pgen.1000529&rft_dat=%3Cgale_plos_%3EA203232070%3C/gale_plos_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c813t-bf1101ead3238d154fa58137d72bfbc65a7e317c9b495a79c7383254917e398c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=67408128&rft_id=info:pmid/19543373&rft_galeid=A203232070&rfr_iscdi=true