Comparison of Imputation Strategies for Incomplete Longitudinal Data in Life-Course Epidemiology

Bibliographic Details
Published in:	American journal of epidemiology 2023-11, Vol.192 (12), p.2075-2084
Main Authors:	Shaw, Crystal, Wu, Yingyan, Zimmerman, Scott C, Hayes-Larson, Eleanor, Belin, Thomas R, Power, Melinda C, Glymour, M Maria, Mayeda, Elizabeth Rose
Format:	Article
Language:	English
Subjects:	Bias; Computation; Computer Simulation; Data Interpretation, Statistical; Epidemiology; Humans; Linear Models; Matching; Mean square errors; Mental depression; Missing data; Mortality; Practice of Epidemiology; Proportional Hazards Models; Research Design; Root-mean-square errors; Statistical models

Description
Summary:	Incomplete longitudinal data are common in life-course epidemiology and may induce bias leading to incorrect inference. Multiple imputation (MI) is increasingly preferred for handling missing data, but few studies explore MI-method performance and feasibility in real-data settings. We compared 3 MI methods using real data under 9 missing-data scenarios, representing combinations of 10%, 20%, and 30% missingness and missing completely at random, at random, and not at random. Using data from Health and Retirement Study (HRS) participants, we introduced record-level missingness to a sample of participants with complete data on depressive symptoms (1998–2008), mortality (2008–2018), and relevant covariates. We then imputed missing data using 3 MI methods (normal linear regression, predictive mean matching, variable-tailored specification), and fitted Cox proportional hazards models to estimate effects of 4 operationalizations of longitudinal depressive symptoms on mortality. We compared bias in hazard ratios, root mean square error, and computation time for each method. Bias was similar across MI methods, and results were consistent across operationalizations of the longitudinal exposure variable. However, our results suggest that predictive mean matching may be an appealing strategy for imputing life-course exposure data, given consistently low root mean square error, competitive computation times, and few implementation challenges.
ISSN:	0002-9262, 1476-6256
DOI:	10.1093/aje/kwad139
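
Illustrative sketch (not from the article): the summary describes introducing missingness into complete data, imputing with predictive mean matching, and comparing bias and root mean square error against the complete-data result. The Python below is a minimal, simplified version of that idea under stated assumptions. The simulated data, the function names `pmm_impute` and `bias_and_rmse`, the donor-pool size `k=5`, and the use of a sample mean as the estimand are all hypothetical. The study itself estimated hazard ratios from Cox models on HRS data, and a full predictive mean matching procedure would also draw the regression coefficients from their posterior rather than use point estimates.

```python
import numpy as np


def pmm_impute(x, y, rng, k=5):
    """One simplified predictive-mean-matching draw for missing values of y.

    x is fully observed; missing entries of y are NaN. Assumption: regression
    coefficients are point estimates, not posterior draws.
    """
    obs = ~np.isnan(y)
    design = np.column_stack([np.ones_like(x), x])
    # Fit a linear model on the observed rows only.
    beta, *_ = np.linalg.lstsq(design[obs], y[obs], rcond=None)
    pred = design @ beta
    donor_pred, donor_y = pred[obs], y[obs]
    y_imp = y.copy()
    for i in np.flatnonzero(~obs):
        # The k observed rows whose predicted means are closest form the donor pool.
        nearest = np.argsort(np.abs(donor_pred - pred[i]))[:k]
        # Impute with the observed value of one randomly chosen donor.
        y_imp[i] = donor_y[rng.choice(nearest)]
    return y_imp


def bias_and_rmse(estimates, truth):
    """Mean error and root mean square error of estimates against a reference value."""
    err = np.asarray(estimates, dtype=float) - truth
    return err.mean(), np.sqrt(np.mean(err ** 2))


# Hypothetical complete data standing in for a real cohort.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y_full = 2.0 + 1.5 * x + rng.normal(size=n)
truth = y_full.mean()  # complete-data estimate used as the reference

estimates = []
for _ in range(20):
    y_miss = y_full.copy()
    # Introduce 20% missingness completely at random.
    y_miss[rng.random(n) < 0.20] = np.nan
    estimates.append(pmm_impute(x, y_miss, rng).mean())

bias, rmse = bias_and_rmse(estimates, truth)
print(f"bias={bias:.3f}, rmse={rmse:.3f}")
```

Because each imputed value is copied from an observed donor, imputations always stay within the range of observed data. That property is a commonly cited practical advantage of predictive mean matching and is in keeping with the few implementation challenges the summary reports for it.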