Loading…
Exome sequence genotype imputation in globally diverse hexaploid wheat accessions
Key message Imputing genotypes from the 90K SNP chip to exome sequence in wheat was moderately accurate. We investigated the factors that affect imputation and propose several strategies to improve accuracy. Imputing genetic marker genotypes from low to high density has been proposed as a cost-effec...
Saved in:
Published in: | Theoretical and applied genetics 2017-07, Vol.130 (7), p.1393-1404 |
---|---|
Main Authors: | , , , , , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Key message
Imputing genotypes from the 90K SNP chip to exome sequence in wheat was moderately accurate. We investigated the factors that affect imputation and propose several strategies to improve accuracy.
Imputing genetic marker genotypes from low to high density has been proposed as a cost-effective strategy to increase the power of downstream analyses (e.g. genome-wide association studies and genomic prediction) for a given budget. However, imputation is often imperfect and its accuracy depends on several factors. Here, we investigate the effects of reference population selection algorithms, marker density and imputation algorithms (Beagle4 and FImpute) on the accuracy of imputation from low SNP density (9K array) to the Infinium 90K single-nucleotide polymorphism (SNP) array for a collection of 837 hexaploid wheat Watkins landrace accessions. Based on these results, we then used the best performing reference selection and imputation algorithms to investigate imputation from 90K to exome sequence for a collection of 246 globally diverse wheat accessions. Accession-to-nearest-entry and genomic relationship-based methods were the best performing selection algorithms, and FImpute resulted in higher accuracy and was more efficient than Beagle4. The accuracy of imputing exome capture SNPs was comparable to imputing from 9 to 90K at approximately 0.71. This relatively low imputation accuracy is in part due to inconsistency between 90K and exome sequence formats. We also found the accuracy of imputation could be substantially improved to 0.82 when choosing an equivalent number of exome SNP, instead of 90K SNPs on the existing array, as the lower density set. We present a number of recommendations to increase the accuracy of exome imputation. |
---|---|
ISSN: | 0040-5752 1432-2242 |
DOI: | 10.1007/s00122-017-2895-3 |