Semantic Analysis of Genome Annotations using Weighting Schemes

Bibliographic Details
Main Authors: Done, B., Khatri, P., Done, A., Draghici, S.
Format: Conference Proceeding
Language: English
Description
Summary: The correct interpretation of many molecular biology experiments depends in an essential way on the accuracy and consistency of the existing annotation databases. Such databases are meant to act as repositories for our biological knowledge as we acquire and refine it. Hence, by definition, they are incomplete at any given time. In this paper we describe a technique that improves our previous method for extracting implicit semantic relationships between genes and functions. We added a number of weighting schemes to our previous latent semantic indexing approach. We used this technique to analyze the current annotations of the human genome. The predictions of 15 different weighting schemes were compared and evaluated. Out of the top 50 functional annotations predicted using the best-performing weighting scheme, we found support in the literature for 82% of them. For 10% of our predictions we did not find any relevant publications, and 6% were actually contradicted by existing literature. This weighting scheme also outperformed the simple binary scheme used in our previous approach. Our method is independent of the organism and can be used to analyze and improve the quality of the data of any public or private annotation database.
DOI: 10.1109/CIBCB.2007.4221226