RadarTSR: A new algorithm for cellwise and rowwise outlier detection and missing data imputation

Bibliographic Details
Published in: Chemometrics and Intelligent Laboratory Systems, 2024-04, Vol. 247, p. 105047, Article 105047
Main Authors: González-Cebrián, Alba; Folch-Fortuny, Abel; Arteaga, Francisco; Ferrer, Alberto
Format: Article
Language: English
Description
Summary: High-dimensional and multivariate data sets often contain missing data and/or cellwise and rowwise outliers. Whereas several solutions have been proposed to deal with each of these issues independently, far fewer techniques address them simultaneously. In this paper, we introduce RadarTSR, a Robust Adaptation for Data with Anomalous Rows and/or cells of the Trimmed Scores Regression method, which is based on Principal Component Analysis (PCA). RadarTSR detects cellwise and rowwise outliers, imputes missing data without the harmful effect of outliers, and, if grouped rowwise outliers are detected, imputes them with their own model. The performance of RadarTSR is compared to that of the MacroPCA algorithm, which is, to the best of our knowledge, the only other proposal that deals with missing data and considers these two types of outliers. Several simulated and real data sets are used. The RadarTSR code is available in Matlab.
• RadarTSR is a new tool for PCA-MB with missing data, cellwise and rowwise outliers.
• The algorithm also searches for potential clusters among rowwise outliers.
• RadarTSR rivals its main competitor, with relevant differences in chemometrics cases.
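Illustration: the summary describes imputing missing cells from a PCA model. The Python sketch below shows only the generic idea of iterative PCA imputation, as a rough aid to intuition. It is not RadarTSR and not Trimmed Scores Regression, it has no robustness to cellwise or rowwise outliers, and it is unrelated to the authors' Matlab code. The function name `pca_impute` and its parameters are hypothetical.

```python
import numpy as np


def pca_impute(X, n_components=1, n_iter=200, tol=1e-8):
    """Fill NaN cells by alternating a PCA fit and a low-rank reconstruction.

    Generic, non-robust illustration (hypothetical helper, not RadarTSR).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    if not missing.any():
        return X.copy()
    # Start from column means computed on the observed cells only.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        # PCA via the SVD of the mean-centred matrix.
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        scores = U[:, :n_components] * s[:n_components]
        recon = scores @ Vt[:n_components] + mean
        change = np.max(np.abs(recon[missing] - filled[missing]))
        # Only missing cells are overwritten; observed cells stay as measured.
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled
```

A call such as `pca_impute(X, n_components=2)` returns a copy of `X` with the NaN cells replaced. Because outliers would distort both the column means and the principal components here, a robust method of the kind the summary describes is needed when the data are contaminated.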
ISSN: 0169-7439, 1873-3239
DOI: 10.1016/j.chemolab.2023.105047