Learning rule-based models of biological process from gene expression time profiles using Gene Ontology

Bibliographic Details
Published in: Bioinformatics 2003-06, Vol. 19 (9), p. 1116-1123
Main Authors: Hvidsten, Torgeir R., Lægreid, Astrid, Komorowski, Jan
Format: Article
Language: English
Description
Summary:
Motivation: Microarray technology enables large-scale inference of the participation of genes in biological process from similar expression profiles. Our aim is to induce classificatory models from expression data and biological knowledge that can automatically associate genes with novel hypotheses of biological process.
Results: We report a systematic supervised learning approach to predicting biological process from time series of gene expression data and biological knowledge. Biological knowledge is expressed using Gene Ontology, and this knowledge is associated with discriminatory expression-based features to form minimal decision rules. The resulting rule model is first evaluated, using cross-validation, on genes coding for proteins with known biological process roles. It is then used to generate hypotheses for genes for which no knowledge of participation in biological process could be found. The theoretical foundation for the methodology, based on rough sets, is outlined in the paper, and its practical application is demonstrated on a data set previously published by Cho et al. (Nat. Genet., 27, 48–54, 2001).
Availability: The Rosetta system is available at http://www.idi.ntnu.no/~aleks/rosetta
Contact: Jan.Komorowski@lcb.uu.se
Supplementary Information: http://www.lcb.uu.se/~hvidsten/bioinf_cho/
ISSN: 1367-4803, 1460-2059
DOI: 10.1093/bioinformatics/btg047