Loading…

A causal data fusion method for the general exposure and outcome

Abstract With the advent of the big data era, the need to combine multiple individual data sets to draw causal effects arises naturally in many medical and biological applications. Especially each data set cannot measure enough confounders to infer the causal effect of an exposure on an outcome. In...

Full description

Saved in:
Bibliographic Details
Published in:Statistics in medicine 2021-11, Vol.41 (2)
Main Authors: Li, Hongkai, Jia, Jinzhu, Yan, Ran, Xue, Fuzhong, Geng, Zhi
Format: Article
Language:English
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Abstract With the advent of the big data era, the need to combine multiple individual data sets to draw causal effects arises naturally in many medical and biological applications. Especially each data set cannot measure enough confounders to infer the causal effect of an exposure on an outcome. In this article, we extend the method proposed by a previous study to causal data fusion of more than two data sets without external validation and to a more general (continuous or discrete) exposure and outcome. Theoretically, we obtain the condition for identifiability of exposure effects using multiple individual data sources for the continuous or discrete exposure and outcome. The simulation results show that our proposed causal data fusion method has unbiased causal effect estimate and higher precision than traditional regression, meta‐analysis and statistical matching methods. We further apply our method to study the causal effect of BMI on glucose level in individuals with diabetes by combining two data sets. Our method is essential for causal data fusion and provides important insights into the ongoing discourse on the empirical analysis of merging multiple individual data sources.
ISSN:0277-6715
1097-0258