Efficient data management in a large-scale epidemiology research project

Bibliographic Details
Published in: Computer Methods and Programs in Biomedicine, 2012-09, Vol. 107 (3), p. 425-435
Main Authors: Meyer, Jens; Ostrzinski, Stefan; Fredrich, Daniel; Havemann, Christoph; Krafczyk, Janina; Hoffmann, Wolfgang
Format: Article
Language: English
Description
Summary: This article describes the concept of a “Central Data Management” (CDM) and its implementation within the large-scale population-based medical research project “Personalized Medicine”. The CDM can be summarized as a combination of data capturing, data integration, data storage, data refinement, and data transfer. A wide spectrum of reliable “Extract Transform Load” (ETL) software for automatic data integration, as well as “electronic Case Report Forms” (eCRFs), was developed in order to integrate decentralized and heterogeneously captured data. Because the captured data are highly sensitive, high system availability, data privacy, data security, and quality assurance are of utmost importance. A complex data model was developed and implemented using an Oracle database in high-availability cluster mode in order to integrate different types of participant-related data. Intelligent data capturing and storage mechanisms improve the quality of the data. Data privacy is ensured by a multi-layered role/right system for access control and by de-identification of identifying data. A well-defined backup process prevents data loss. Over a period of one and a half years, the CDM has captured a wide variety of data, approximately 5 terabytes in total, without experiencing any critical incidents of system breakdown or loss of data. The aim of this article is to demonstrate one possible way of establishing a Central Data Management in large-scale medical and epidemiological studies.
ISSN: 0169-2607, 1872-7565
DOI: 10.1016/j.cmpb.2010.12.016