A pipeline for harmonising NHS Scotland laboratory data to enable national-level analyses

Bibliographic Details
Published in: Journal of biomedical informatics 2025-01, p.104771, Article 104771
Main Authors: Gao, Chuang; Mumtaz, Shahzad; McCall, Sophie; O'Sullivan, Katherine; McGilchrist, Mark; Morales, Daniel R; Hall, Christopher; Wilde, Katie; Mayor, Charlie; Linksted, Pamela; Harrison, Kathy; Cole, Christian; Jefferson, Emily
Format: Article
Language: English
Description
Summary: Medical laboratory data, together with prescribing and hospitalisation records, are three of the most widely used types of electronic health record (EHR) for data-driven health research. In Scotland, hospitalisation, prescribing and death register data are available nationally, whereas laboratory data is captured, stored and reported from local health board systems with significant heterogeneity. For researchers and other users of this regionally curated data, working on laboratory datasets across regional cohorts requires effort and time. As part of this study, the Scottish Safe Haven Network have developed an open-source software pipeline to generate a harmonised laboratory dataset. We obtained sample laboratory data from the four regional Safe Havens in Scotland covering people within the SHARE consented cohort. We compared the variables collected by each regional Safe Haven and mapped these to 11 FHIR and 2 Scottish-specific standardised terms (i.e., one to indicate the regional health board and a second to describe the source clinical code description). Results: We compared the laboratory data and found that 180 test codes covered 98.7% of test records performed across Scotland. Focusing on the 180 test codes, we developed a set of transformations to convert test results captured in different units to the same unit. We included both Read Codes and SNOMED CT to encode the tests within the pipeline. We validated our harmonisation pipeline by comparing the results across the different regional datasets. The pipeline can be reused by researchers and/or Safe Havens to generate clean, harmonised laboratory data at a national level with minimal effort.
ISSN: 1532-0464
1532-0480
DOI: 10.1016/j.jbi.2024.104771
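
Illustration: The summary mentions transformations that convert test results captured in different units to the same unit. The sketch below shows one way such a step might look. It is not the authors' pipeline: the test code "HB", the board names, the `LabResult` fields and the `CONVERSIONS` table are all hypothetical and chosen only for illustration (the one factual ingredient is that 1 g/dL equals 10 g/L).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup: (test_code, source_unit) -> (target_unit, multiplier).
# The code "HB" is a placeholder, not a Read Code or SNOMED CT identifier.
CONVERSIONS = {
    ("HB", "g/L"): ("g/L", 1.0),
    ("HB", "g/dL"): ("g/L", 10.0),  # 1 g/dL = 10 g/L
}


@dataclass
class LabResult:
    health_board: str  # hypothetical field naming the regional source
    test_code: str
    value: float
    unit: str


def harmonise(result: LabResult) -> Optional[LabResult]:
    """Return the result in the target unit, or None if no rule is known."""
    rule = CONVERSIONS.get((result.test_code, result.unit))
    if rule is None:
        # Unknown code/unit pairs are left for manual review, not guessed.
        return None
    target_unit, factor = rule
    return LabResult(
        health_board=result.health_board,
        test_code=result.test_code,
        value=result.value * factor,
        unit=target_unit,
    )


if __name__ == "__main__":
    records = [
        LabResult("Board A", "HB", 135.0, "g/L"),
        LabResult("Board B", "HB", 13.5, "g/dL"),
        LabResult("Board C", "HB", 8.4, "mmol/L"),  # no rule in this sketch
    ]
    for record in records:
        print(record.health_board, harmonise(record))
```

Returning `None` for an unrecognised code and unit pair, rather than passing the value through, keeps unconverted results from silently mixing with harmonised ones.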