An integrated, ontology-driven approach to constructing observational databases for research
Published in: | Journal of biomedical informatics 2015-06, Vol.55, p.132-142 |
---|---|
Format: | Article |
Language: | English |
Summary: | Highlights:
• Incomplete and inconsistent data limit the use of observational data in research.
• An ontology-driven framework is proposed for extracting and representing data.
• The ontology facilitates standardization, data extraction, and semantic retrieval.
• Two examples illustrate how the ontology supports the analytic workflow.
The electronic health record (EHR) contains a diverse set of clinical observations that are captured as part of routine care, but the incomplete, inconsistent, and sometimes incorrect nature of clinical data poses significant impediments for its secondary use in retrospective studies or comparative effectiveness research. In this work, we describe an ontology-driven approach for extracting and analyzing data from the patient record in a longitudinal and continuous manner. We demonstrate how the ontology helps enforce consistent data representation, integrates phenotypes generated through analyses of available clinical data sources, and facilitates subsequent studies to identify clinical predictors for an outcome of interest. Development and evaluation of our approach are described in the context of studying factors that influence intracranial aneurysm (ICA) growth and rupture. We report our experiences in capturing information on 78 individuals with a total of 120 aneurysms. Two example applications related to assessing the relationship between aneurysm size, growth, gene expression modules, and rupture are described. Our work highlights the challenges with respect to data quality, workflow, and analysis of data and its implications toward a learning health system paradigm. |
---|---|
ISSN: | 1532-0464, 1532-0480 |
DOI: | 10.1016/j.jbi.2015.03.008 |