Loading…

Analyses and comparison of accuracy of different genotype imputation methods

The power of genetic association analyses is often compromised by missing genotypic data which contributes to lack of significant findings, e.g., in in silico replication studies. One solution is to impute untyped SNPs from typed flanking markers, based on known linkage disequilibrium (LD) relations...

Full description

Saved in:
Bibliographic Details
Published in:PloS one 2008-10, Vol.3 (10), p.e3551-e3551
Main Authors: Pei, Yu-Fang, Li, Jian, Zhang, Lei, Papasian, Christopher J, Deng, Hong-Wen
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The power of genetic association analyses is often compromised by missing genotypic data which contributes to lack of significant findings, e.g., in in silico replication studies. One solution is to impute untyped SNPs from typed flanking markers, based on known linkage disequilibrium (LD) relationships. Several imputation methods are available and their usefulness in association studies has been demonstrated, but factors affecting their relative performance in accuracy have not been systematically investigated. Therefore, we investigated and compared the performance of five popular genotype imputation methods, MACH, IMPUTE, fastPHASE, PLINK and Beagle, to assess and compare the effects of factors that affect imputation accuracy rates (ARs). Our results showed that a stronger LD and a lower MAF for an untyped marker produced better ARs for all the five methods. We also observed that a greater number of haplotypes in the reference sample resulted in higher ARs for MACH, IMPUTE, PLINK and Beagle, but had little influence on the ARs for fastPHASE. In general, MACH and IMPUTE produced similar results and these two methods consistently outperformed fastPHASE, PLINK and Beagle. Our study is helpful in guiding application of imputation methods in association analyses when genotype data are missing.
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0003551