Computation of the probabilities of families of biological sequences

Bibliographic Details
Published in: Biophysics (Oxford), 2009-10, Vol. 54 (5), pp. 569-573
Main Author: Roytberg, M. A.
Format: Article
Language: English
Description
Summary: An algorithm for computing the probabilities of families of biological sequences is presented. The algorithm is applicable to many problems in bioinformatics, in particular to computing seed sensitivity in the search for local similarities in genomes and to estimating the reliability of searches for clusters of regulatory sites. It can also be used with probability distributions described by different models, e.g., Bernoulli, Markov, and hidden Markov models. The algorithm describes both the probability distribution and the family of sequences by finite automata, which reduces the problem of calculating the probabilities to computing an appropriate generalized partition function. The algorithm can be applied not only to biological sequences but also to symbol sequences of any origin.
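
The automaton-based idea in the summary can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes the simplest case, a Bernoulli (independent letters) model and a family of sequences given by a complete deterministic finite automaton, and it sums the probabilities of all accepted sequences of a fixed length by dynamic programming over automaton states. The names `acceptance_probability` and `contains_aa_automaton`, and the example family "sequences containing AA", are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, Set, Tuple


def acceptance_probability(
    transitions: Dict[Tuple[Hashable, str], Hashable],
    start: Hashable,
    accepting: Set[Hashable],
    letter_probs: Dict[str, float],
    length: int,
) -> float:
    """Probability that a random sequence of the given length, with letters
    drawn independently from letter_probs, is accepted by a complete DFA."""
    # weights[state] = total probability of all prefixes that lead to `state`
    weights: Dict[Hashable, float] = {start: 1.0}
    for _ in range(length):
        next_weights: Dict[Hashable, float] = {}
        for state, mass in weights.items():
            for letter, p in letter_probs.items():
                target = transitions[(state, letter)]
                next_weights[target] = next_weights.get(target, 0.0) + mass * p
        weights = next_weights
    # Sum the mass that ended in accepting states
    return sum(mass for state, mass in weights.items() if state in accepting)


def contains_aa_automaton(alphabet: str):
    """Hypothetical example family: sequences that contain the word 'AA'.
    State 0: no progress, state 1: last letter was 'A', state 2: 'AA' seen."""
    transitions = {}
    for letter in alphabet:
        transitions[(0, letter)] = 1 if letter == "A" else 0
        transitions[(1, letter)] = 2 if letter == "A" else 0
        transitions[(2, letter)] = 2  # absorbing accepting state
    return transitions, 0, {2}


# Assumed uniform Bernoulli model over the DNA alphabet
uniform = {letter: 0.25 for letter in "ACGT"}
transitions, start, accepting = contains_aa_automaton("ACGT")
prob = acceptance_probability(transitions, start, accepting, uniform, 3)
print(prob)  # 7 of the 64 sequences of length 3 contain 'AA'
```

The cost of this sketch grows linearly with the sequence length and with the number of automaton transitions. Markov or hidden Markov models would need the model state tracked alongside the automaton state, which the sketch does not attempt.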
ISSN: 0006-3509, 1555-6654
DOI: 10.1134/S0006350909050029