Loading…

New approaches for reconstructing phylogenies from gene order data

We report on new techniques we have developed for reconstructing phylogenies on whole genomes. Our mathematical techniques include new polynomial-time methods for bounding the inversion length of a candidate tree and new polynomial-time methods for estimating genomic distances which greatly improve...

Full description

Saved in:
Bibliographic Details
Published in:Bioinformatics 2001-06, Vol.17 (suppl-1), p.S165-S173
Main Authors: Moret, Bernard M.E., Wang, Li-San, Warnow, Tandy, Wyman, Stacia K.
Format: Article
Language:English
Subjects:
Citations: Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We report on new techniques we have developed for reconstructing phylogenies on whole genomes. Our mathematical techniques include new polynomial-time methods for bounding the inversion length of a candidate tree and new polynomial-time methods for estimating genomic distances which greatly improve the accuracy of neighbor-joining analyses. We demonstrate the power of these techniques through an extensive performance study based on simulating genome evolution under a wide range of model conditions. Combining these new tools with standard approaches (fast reconstruction with neighbor-joining, exploration of all possible refinements of strict consensus trees, etc.) has allowed us to analyze datasets that were previously considered computationally impractical. In particular, we have conducted a complete phylogenetic analysis of a subset of the Campanulaceae family, confirming various conjectures about the relationships among members of the subset and about the principal mechanism of evolution for their chloroplast genome. We give representative results of the extensive experimentation we conducted on both real and simulated datasets in order to validate and characterize our approaches. We find that our techniques provide very accurate reconstructions of the true tree topology even when the data are generated by processes that include a significant fraction of transpositions and when the data are close to saturation. Contact: moret@cs.unm.eduor tandy@cs.utexas.edu
ISSN:1367-4803
1460-2059
1367-4811
DOI:10.1093/bioinformatics/17.suppl_1.S165