Prediction of eukaryotic gene structures based on multilevel optimization

Bibliographic Details
Published in: Chinese Science Bulletin 2004-02, Vol. 49 (4), p. 321-328
Main Authors: Zhou, Yanhong; Yang, Lei; Wang, Hui; Lu, Feng; Wan, Honghui
Format: Article
Language: English
Description
Summary: Computational gene structure prediction, which is valuable for finding new genes and understanding genome composition, plays an important role in genome projects. For eukaryotic gene structures, however, the prediction accuracy of existing methods is still limited. This paper presents a method for predicting eukaryotic gene structures based on multilevel optimization. The complicated problem of predicting gene structures in a eukaryotic DNA sequence containing multiple genes can be decomposed into a series of sub-problems at several levels of decreasing complexity: the gene level (single-exon gene, multi-exon gene), the element level (exon, intron, etc.), and the feature level (functional site signals, codon usage preference, etc.). On the basis of this decomposition, a multilevel model for predicting complex gene structures is built by a multilevel optimization process, in which the models for sub-problems at lower complexity levels are first optimized individually and then optimally combined to form models for the sub-problems at higher complexity levels. Based on the multilevel model, a dynamic programming algorithm is designed to search for optimal gene structures in DNA sequences, and a new program, GeneKey (1.0), is developed for the prediction of eukaryotic gene structures. Test results on widely used datasets show that the prediction accuracies of GeneKey (1.0) at the nucleotide, exon, and gene levels are all higher than those of the well-known program GENSCAN. A web server for GeneKey (1.0) is available at http://infosci.hust.edu.cn
ISSN: 1001-6538, 2095-9273, 1861-9541, 2095-9281
DOI: 10.1007/BF02900313
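
The summary mentions a dynamic programming search for optimal gene structures but gives no algorithmic detail. As a rough illustration of the general idea only, and not the GeneKey algorithm, the sketch below picks the highest-scoring chain of non-overlapping candidate exons. Everything in it is an assumption made for illustration: the function name `best_exon_chain`, the `(start, end, score)` tuple layout with an exclusive end coordinate, the `min_intron` parameter, and the additive scores. It ignores the reading-frame, splice-signal, and codon-usage models that a real predictor would combine at the feature and element levels.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical candidate exon: (start, end, score), with end exclusive.
Exon = Tuple[int, int, float]


def best_exon_chain(candidates: List[Exon],
                    min_intron: int = 0) -> Tuple[float, List[Exon]]:
    """Return the best total score and the chain of exons that achieves it.

    Classic weighted-interval-scheduling dynamic programme: exons are
    sorted by end coordinate, and best[i] holds the best score reachable
    using only the first i exons in that order.
    """
    exons = sorted(candidates, key=lambda e: e[1])
    ends = [e[1] for e in exons]
    n = len(exons)

    best = [0.0] * (n + 1)    # best[i]: optimum over the first i exons
    take = [False] * (n + 1)  # take[i]: is exon i part of that optimum?
    prev = [0] * (n + 1)      # prev[i]: how many earlier exons stay usable

    for i, (start, _end, score) in enumerate(exons, start=1):
        # Count earlier exons that end at least min_intron before this start.
        p = bisect_right(ends, start - min_intron, 0, i - 1)
        prev[i] = p
        with_exon = best[p] + score
        if with_exon > best[i - 1]:
            best[i] = with_exon
            take[i] = True
        else:
            best[i] = best[i - 1]

    # Trace back through the table to recover the chosen exons.
    chain: List[Exon] = []
    i = n
    while i > 0:
        if take[i]:
            chain.append(exons[i - 1])
            i = prev[i]
        else:
            i -= 1
    chain.reverse()
    return best[n], chain


if __name__ == "__main__":
    # Made-up candidates, not data from the paper.
    demo = [(0, 100, 2.0), (50, 150, 3.5), (120, 200, 2.0), (210, 300, 1.0)]
    print(best_exon_chain(demo))
    print(best_exon_chain(demo, min_intron=30))
```

Sorting by end coordinate lets a binary search find the compatible predecessors of each exon, so the search runs in O(n log n) time for n candidates. Raising `min_intron` forces a minimum gap between consecutive exons, which can change which chain is optimal.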