Knowledge-based voting algorithm for automated protein functional annotation

Bibliographic Details
Published in: Proteins: Structure, Function, and Bioinformatics, 2005-12, Vol. 61 (4), p. 907-917
Main Authors: Yu, G.X.; Glass, E.M.; Karonis, N.T.; Maltsev, N.
Format: Article
Language: English
Description
Summary: Automated annotation of high‐throughput genome sequences is one of the earliest steps toward a comprehensive understanding of the dynamic behavior of living organisms. However, the step is often error‐prone because of its underlying algorithms, which rely mainly on a simple similarity analysis, and lack of guidance from biological rules. We present herein a knowledge‐based protein annotation algorithm. Our objectives are to reduce errors and to improve annotation confidences. This algorithm consists of two major components: a knowledge system, called “RuleMiner,” and a voting procedure. The knowledge system, which includes biological rules and functional profiles for each function, provides a platform for seamless integration of multiple sequence analysis tools and guidance for function annotation. The voting procedure, which relies on the knowledge system, is designed to make (possibly) unbiased judgments in functional assignments among complicated, sometimes conflicting, information. We have applied this algorithm to 10 prokaryotic bacterial genomes and observed a significant improvement in annotation confidences. We also discuss the current limitations of the algorithm and the potential for future improvement. Proteins 2005. © 2005 Wiley‐Liss, Inc.
ISSN: 0887-3585, 1097-0134
DOI: 10.1002/prot.20652