Advances in Biomedical Missing Data Imputation: A Survey

Bibliographic Details
Published in: IEEE Access 2024-12, p.1-1
Main Authors: Barrabes, Miriam, Perera, Maria, Moriano, Victor Novelle, Giro-I-Nieto, Xavier, Montserrat, Daniel Mas, Ioannidis, Alexander G.
Format: Article
Language: English
Description
Summary: Ensuring good data quality in biomedical sciences is crucial for reliable research outcomes, particularly as precision medicine continues to gain prominence. Missing values compromise data quality and can make data-based studies difficult to perform. The origins of missing values in biomedical datasets are diverse, including experimental errors, equipment malfunctions, and variations in data collection protocols tailored to individual patient conditions. To address the complex nature of missing values and the unique characteristics of biomedical data, a diverse spectrum of computational imputation techniques has emerged. These methods range from traditional statistical analysis to more modern approaches such as discriminative machine learning models and deep generative networks. This survey paper provides a comprehensive overview of the extensive literature on missing data imputation techniques, with a specific focus on applications in genomics, single-cell RNA sequencing, health records, and medical imaging. We outline the fundamental principles underlying each imputation technique and present a detailed analysis of their advantages and disadvantages, categorized by missing data patterns. To aid practitioners in method selection, we offer practical recommendations based on critical factors such as dataset size, data type, and missingness rate. By synthesizing insights from existing literature, we provide a holistic perspective on the effectiveness of various imputation methods across different biomedical contexts, thereby helping researchers and practitioners make informed decisions when applying imputation techniques to biomedical data processing.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3516506