Complex Models of Sequence Evolution Require Accurate Estimators as Exemplified with the Invariable Site Plus Gamma Model

Bibliographic Details
Published in: Systematic Biology 2018-05, Vol. 67 (3), p. 552-558
Main Authors: Nguyen, Lam-Tung; von Haeseler, Arndt; Minh, Bui Quang
Format: Article
Language: English
Description
Summary: The invariable site plus Γ model (I+Γ) is widely used to model rate heterogeneity among alignment sites in maximum likelihood and Bayesian phylogenetic analyses. The proof that the I+ continuous Γ model is identifiable (model parameters can be inferred correctly given enough data) has increased the credibility of its application to phylogeny reconstruction. However, most phylogenetic software implements the I+ discrete Γ model, whose identifiability is likely but unproven. How well the parameters of the I+ discrete Γ model are estimated is still disputed. In particular, the correlation between the fraction of invariable sites and the fraction of sites with a slow evolutionary rate is regarded as problematic. We show that optimization heuristics as implemented in frequently used phylogenetic software (PhyML, RAxML, IQ-TREE, and MrBayes) cannot always reliably estimate the shape parameter, the proportion of invariable sites, and the tree length. Here, we propose an improved optimization heuristic that accurately estimates the three parameters. While research efforts mainly focus on tree search methods, our results signify the equal importance of verifying and developing effective estimation methods for complex models of sequence evolution.
ISSN: 1063-5157, 1076-836X
DOI: 10.1093/sysbio/syx092
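
Note on the model: the summary refers to the I+ discrete Γ model without stating it. As a minimal sketch, the per-site likelihood under this model is commonly written as below. The notation is an assumption chosen for illustration and is not taken from the article: p_inv is the proportion of invariable sites, k is the number of equally probable rate categories, r_j is the representative rate of category j under a Γ distribution with shape α and mean 1, and L_i(r) is the likelihood of site i when all branch lengths are scaled by rate r.

```latex
% I + discrete Gamma site likelihood (standard textbook form; symbols are illustrative)
L_i(\alpha, p_{\mathrm{inv}})
  = p_{\mathrm{inv}} \, L_i(r = 0)
  + (1 - p_{\mathrm{inv}}) \, \frac{1}{k} \sum_{j=1}^{k} L_i(r_j),
\qquad r_j \text{ derived from } \Gamma(\alpha, \alpha).
% L_i(r = 0) is nonzero only for constant sites, where it equals the
% stationary frequency of the observed state.
```

In this form, a constant site can be explained either by the invariable class or by a slow Γ category with small r_j, which is one way to see why the estimates of p_inv and α may be correlated, as the summary notes.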