High-accuracy haplotype imputation using unphased genotype data as the references


Bibliographic Details
Published in: Gene, 2015-11, Vol. 572 (2), p. 279-284
Main Authors: Li, Wenzhi; Xu, Wei; Fu, Guoxing; Ma, Li; Richards, Jendai; Rao, Weinian; Bythwood, Tameka; Guo, Shiwen; Song, Qing
Format: Article
Language: English
Description
Summary: Rapidly growing genomic datasets pose a new challenge for missing-data imputation, a notoriously resource-demanding task. Haplotype imputation requires ethnicity-matched references. However, to date, haplotype references are not available for the majority of populations in the world. We explored using existing unphased genotype datasets as references; if successful, this approach would cover almost all of the populations in the world. The results showed that our HiFi software yields 99.43% accuracy with unphased genotype references. Our method provides a cost-effective way to break through the bottleneck of limited reference availability for haplotype imputation in the big data era.
• The accuracy is as high as 99.43%.
• It can finish a whole-genome imputation within 2 min on a laptop computer.
• The availability issue of ethnicity-matched references is solved.
ISSN: 0378-1119
EISSN: 1879-0038
DOI: 10.1016/j.gene.2015.07.082