Learning latent heterogeneity for type 2 diabetes patients using longitudinal health markers in electronic health records

Bibliographic Details
Published in: Statistics in Medicine, 2021-04, Vol. 40 (8), p. 1930-1946
Main Authors: Lou, Jitong; Wang, Yuanjia; Li, Lang; Zeng, Donglin
Format: Article
Language: English
Description
Summary: Electronic health records (EHRs) from type 2 diabetes (T2D) patients consist of longitudinally and sparsely measured health markers at clinical encounters. Our goal is to use such data to learn latent patterns that can inform patients' health status related to T2D while accounting for challenges in retrospectively collected EHRs. To handle challenges such as correlated longitudinal measurements, irregular and informative encounter times, and mixed marker types, we propose multivariate generalized linear models to learn latent patient subgroups. In our model, covariate effects were time‐dependent and latent Gaussian processes were introduced to model between‐marker correlations over time. Using inferred latent processes, we integrated the irregularly measured health markers of mixed types into composite scores and applied hierarchical clustering to learn latent subgroup structures among T2D patients. Application to an EHR dataset of T2D patients showed different trends of age, sex, and race effects on hypertension/high blood pressure, total cholesterol, glycated hemoglobin, high‐density lipoprotein, and medications. The associations among these markers varied over time during the study window. Clustering results revealed four subgroups, each with distinct health status. The same patterns were further confirmed using new EHR records of the same cohort. We developed a novel latent model to integrate longitudinal health markers in EHRs and characterize latent patient heterogeneity. Analysis indicated that there were distinct subgroups of T2D patients, suggesting that effective healthcare management for these patients should be tailored to each subgroup.
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.8880