Evolutionary HMMs: a Bayesian approach to multiple alignment

Bibliographic Details
Published in: Bioinformatics 2001-09, Vol. 17 (9), p. 803-820
Main Authors: Holmes, Ian; Bruno, William J.
Format: Article
Language: English
Description
Summary: Motivation: We review proposed syntheses of probabilistic sequence alignment, profiling and phylogeny. We develop a multiple alignment algorithm for Bayesian inference in the links model proposed by Thorne et al. (1991, J. Mol. Evol., 33, 114–124). The algorithm, described in detail in Section 3, samples from and/or maximizes the posterior distribution over multiple alignments for any number of DNA or protein sequences, conditioned on a phylogenetic tree. The individual sampling and maximization steps of the algorithm require no more computational resources than pairwise alignment. Methods: We present a software implementation (Handel) of our algorithm and report test results on (i) simulated data sets and (ii) the structurally informed protein alignments of BAliBASE (Thompson et al., 1999, Nucleic Acids Res., 27, 2682–2690). Results: We find that the mean sum-of-pairs score (a measure of residue-pair correspondence) for the BAliBASE alignments is only 13% lower for Handel than for CLUSTALW (Thompson et al., 1994, Nucleic Acids Res., 22, 4673–4680), despite the relative simplicity of the links model (CLUSTALW uses affine gap scores and increased penalties for indels in hydrophobic regions). With reference to these benchmarks, we discuss potential improvements to the links model and implications for Bayesian multiple alignment and phylogenetic profiling. Availability: The source code to Handel is freely distributed on the Internet at http://www.biowiki.org/Handel under the terms of the GNU Public License (GPL, 2000, http://www.fsf.org./copyleft/gpl.html). Contact: ihh@fruitfly.org
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/17.9.803
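
Note on the sum-of-pairs score: the summary describes it only as "a measure of residue-pair correspondence". A minimal sketch of one common reading is given here, assuming the score is the fraction of residue pairs aligned in a reference alignment that are also aligned in the test alignment. The function names, the gap character and the toy sequences are illustrative assumptions and are not taken from the paper or from Handel, and the exact definition used in the paper may differ.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share an alignment column.

    `alignment` is a list of equal-length gapped strings (an assumption of
    this sketch). Each residue is identified by (row index, position in the
    ungapped sequence), so the result does not depend on column numbering.
    """
    counters = [0] * len(alignment)  # residues seen so far in each row
    pairs = set()
    for column in zip(*alignment):
        present = []
        for row, symbol in enumerate(column):
            if symbol != gap:
                present.append((row, counters[row]))
                counters[row] += 1
        # Every pair of residues in the same column counts as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference, gap="-"):
    """Fraction of reference-aligned residue pairs recovered by `test`."""
    reference_pairs = aligned_pairs(reference, gap)
    if not reference_pairs:
        return 0.0
    recovered = reference_pairs & aligned_pairs(test, gap)
    return len(recovered) / len(reference_pairs)


# Toy data (hypothetical): the test alignment misplaces one gap.
reference = ["AC-GT", "ACTGT"]
test = ["ACG-T", "ACTGT"]
score = sum_of_pairs_score(test, reference)
```

Under this definition a test alignment identical to the reference scores 1.0, and each reference pair that the test alignment fails to place in a common column lowers the score proportionally.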