DynaQ: online learning from imbalanced multi-class streams through dynamic sampling

Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2023-11, Vol.53 (21), p.24908-24930
Main Authors: Sadeghi, Farnaz; Viktor, Herna L.; Vafaie, Parsa
Format: Article
Language: English
Description
Summary: Online supervised learning from fast-evolving data streams, particularly in domains such as health, the environment, and manufacturing, is a crucial research area. However, these domains often experience class imbalance, which can skew class distributions. It is essential for online learning algorithms to analyze large datasets in real time while accurately modeling rare or infrequent classes that may appear in bursts. While methods have been proposed to handle binary class imbalance, there is a lack of attention to multi-class imbalanced settings with varying degrees of imbalance in evolving streams. In this paper, we present the Dynamic Queues (DynaQ) algorithm for online learning in multi-class imbalanced settings to fill this knowledge gap. Our approach utilizes a batch-based resampling method that creates an instance queue for each class to balance the number of instances. We maintain a queue threshold and remove older samples during training. Additionally, we dynamically oversample minority classes based on one of four rate parameters: recall, F1-score, κ_m, and Euclidean distance. Our learning algorithm consists of an ensemble that uses sliding windows and a soft voting scheme while incorporating a drift detection mechanism. Our experimental results demonstrate the superiority of the DynaQ approach over state-of-the-art methods.
ISSN: 0924-669X; 1573-7497
DOI: 10.1007/s10489-023-04886-w
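
Illustration: the summary describes keeping one instance queue per class, capping each queue at a threshold so that older samples are dropped, and oversampling minority classes at a per-class rate. The Python sketch below shows one way that idea could look. It is not the authors' implementation: the class name `ClassQueues`, the parameters `max_len` and `rates`, and the choice of random duplication as the oversampling step are all assumptions made for illustration, since the summary does not specify them.

```python
import random
from collections import defaultdict, deque


class ClassQueues:
    """One bounded FIFO queue per class label (hypothetical helper)."""

    def __init__(self, max_len):
        # max_len plays the role of the queue threshold (assumed name).
        self.max_len = max_len
        self.queues = defaultdict(lambda: deque(maxlen=max_len))

    def add_batch(self, batch):
        """Append (features, label) pairs; a full deque drops its oldest item."""
        for x, y in batch:
            self.queues[y].append(x)

    def balanced_sample(self, rates=None, seed=0):
        """Return queued instances plus random duplicates of smaller classes.

        rates maps a label to a value in [0, 1]; 1.0 (the default) fills the
        gap to the largest queue completely, 0.0 adds no duplicates.
        """
        if not self.queues:
            return []
        rng = random.Random(seed)
        rates = rates or {}
        largest = max(len(q) for q in self.queues.values())
        out = []
        for label, q in self.queues.items():
            items = list(q)
            gap = largest - len(items)
            n_extra = round(rates.get(label, 1.0) * gap)
            # Assumed oversampling step: sample existing items with replacement.
            extra = [rng.choice(items) for _ in range(n_extra)]
            out.extend((x, label) for x in items + extra)
        return out


queues = ClassQueues(max_len=3)
queues.add_batch([(i, "majority") for i in range(5)] + [(9, "minority")])
training_set = queues.balanced_sample()
```

In this sketch the per-class rate is supplied by the caller. How recall, F1-score, κ_m, or Euclidean distance would be turned into such a rate is left open here, because the summary names these parameters without giving the mapping.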