Usefulness of Single Nucleotide Polymorphism Data for Estimating Population Parameters

Bibliographic Details
Published in: Genetics (Austin), 2000-09, Vol. 156 (1), p. 439-447
Main Authors: Kuhner, Mary K; Beerli, Peter; Yamato, Jon; Felsenstein, Joseph
Format: Article
Language: English
Description
Summary: Single nucleotide polymorphism (SNP) data can be used for parameter estimation via maximum likelihood methods as long as the way in which the SNPs were determined is known, so that an appropriate likelihood formula can be constructed. We present such likelihoods for several sampling methods. As a test of these approaches, we consider use of SNPs to estimate the parameter $\Theta = 4N_e\mu$ (the scaled product of effective population size and per-site mutation rate), which is related to the branch lengths of the reconstructed genealogy. With infinite amounts of data, ML models using SNP data are expected to produce consistent estimates of Θ. With finite amounts of data, the estimates are accurate when Θ is high, but tend to be biased upward when Θ is low. If recombination is present and not allowed for in the analysis, the results are additionally biased upward, but this effect can be removed by incorporating recombination into the analysis. SNPs defined as sites that are polymorphic in the actual sample under consideration (sample SNPs) are somewhat more accurate for estimation of Θ than SNPs defined by their polymorphism in a panel chosen from the same population (panel SNPs). Misrepresenting panel SNPs as sample SNPs leads to large errors in the maximum likelihood estimate of Θ. Researchers collecting SNPs should collect and preserve information about the method of ascertainment so that the data can be accurately analyzed.
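
The summary's central point is that the likelihood has to reflect how the SNPs were ascertained. As a rough illustration only, and not the authors' implementation, the sketch below conditions a site likelihood on the site being polymorphic in the sample (the "sample SNP" case) by dividing by the probability that a site is variable. The Jukes-Cantor substitution model, the two-sequence genealogy with total branch length `t`, and all function names are assumptions introduced for this example; none of them appears in the record.

```python
import math

NUCLEOTIDES = "ACGT"


def jc_pair_probability(x, y, t):
    """Joint probability of observing nucleotides x and y at one site in two
    sequences separated by total branch length t (expected substitutions per
    site), under the Jukes-Cantor model (an assumption of this sketch)."""
    decay = math.exp(-4.0 * t / 3.0)
    # Probability that the site ends in state y given it started in state x.
    p_change = 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay
    # Equilibrium base frequency of x is 1/4 under Jukes-Cantor.
    return 0.25 * p_change


def prob_invariant(t):
    """Probability that both sequences carry the same nucleotide at a site."""
    return sum(jc_pair_probability(x, x, t) for x in NUCLEOTIDES)


def sample_snp_likelihood(x, y, t):
    """Likelihood of a variable site pattern (x != y), conditioned on the site
    being polymorphic in the sample. Requires t > 0."""
    if x == y:
        raise ValueError("a sample SNP must be polymorphic in the sample")
    return jc_pair_probability(x, y, t) / (1.0 - prob_invariant(t))


if __name__ == "__main__":
    # Hypothetical branch length chosen only to exercise the functions.
    t = 0.1
    total = sum(
        sample_snp_likelihood(x, y, t)
        for x in NUCLEOTIDES
        for y in NUCLEOTIDES
        if x != y
    )
    print(total)
```

Because the division renormalizes over variable patterns only, the conditional probabilities of the twelve variable patterns sum to one. Omitting the denominator would treat ascertained SNPs as if they were a random sample of sites, which is the kind of mismatch between data and likelihood the summary warns about.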
ISSN: 0016-6731, 1943-2631
DOI: 10.1093/genetics/156.1.439