Representing genetic sequence data for pharmacogenomics: an evolutionary approach using ontological and relational models

Bibliographic Details
Published in: Bioinformatics 2002-07, Vol.18 (suppl-1), p.S207-S215
Main Authors: Rubin, Daniel L., Shafa, Farhad, Oliver, Diane E., Hewett, Micheal, Altman, Russ B.
Format: Article
Language: English
Description
Summary:

Motivation: The information model chosen to store biological data affects the types of queries possible, database performance, and difficulty in updating that information model. Genetic sequence data for pharmacogenetics studies can be complex, and the best information model to use may change over time. As experimental and analytical methods change, and as biological knowledge advances, the data storage requirements and types of queries needed may also change.

Results: We developed a model for genetic sequence and polymorphism data, and used XML Schema to specify the elements and attributes required for this model. We implemented this model as an ontology in a frame-based representation and as a relational model in a database system. We collected genetic data from two pharmacogenetics resequencing studies, and formulated queries useful for analysing these data. We compared the ontology and relational models in terms of query complexity, performance, and difficulty in changing the information model. Our results demonstrate benefits of evolving the schema for storing pharmacogenetics data: ontologies perform well in early design stages as the information model changes rapidly and simplify query formulation, while relational models offer improved query speed once the information model and types of queries needed stabilize.

Availability: Our ontology and relational models are available at http://smi-web.stanford.edu/projects/helix/pubs/ismb02/.

Contact: rubin@smi.stanford.edu, russ.altman@stanford.edu, help@pharmgkb.org

Keywords: ontologies; relational databases; schema; data models; pharmacogenomics.

ISSN: 1367-4803, 1460-2059
DOI: 10.1093/bioinformatics/18.suppl_1.S207