Oversampling With Reliably Expanding Minority Class Regions for Imbalanced Data Learning

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2023-06, Vol. 35 (6), p. 6167-6181
Main Authors: Zhu, Tuanfei, Liu, Xinwang, Zhu, En
Format: Article
Language: English
Description
Summary: This paper proposes a simple interpolation Oversampling method with the purpose of Reliably Expanding the Minority class regions (OREM). OREM first finds a candidate minority region around each original minority sample, then uses this region to identify the clean subregions that contain no majority samples. Synthetic samples are generated only in these clean subregions, so that the regions of the minority class are broadened reliably. Because learning from multiclass imbalanced data is more challenging than the two-class case, we also extend OREM to multiclass imbalance problems through an iterative procedure for generating synthetic samples, which yields the multiclass oversampling algorithm OREM-M. The distinguishing feature of OREM-M is that it reduces class overlap not only between synthetic minority samples and original samples, but also among the synthetic samples of different minority classes. In this way, OREM-M ensures that the data of each class can be modeled well after oversampling. In addition, we embed OREM into a boosting framework to develop a new ensemble method, OREMBoost, for class imbalance problems. Extensive experiments demonstrate the effectiveness of the proposed OREM, OREM-M, and OREMBoost.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2022.3171706
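
Illustration: The summary describes three steps: find a candidate region around each minority sample, keep only the clean subregions that contain no majority samples, and interpolate synthetic samples inside those subregions. The sketch below is one possible reading of those steps, not the authors' algorithm. Everything specific in it is an assumption made for illustration: the function name orem_like_oversample, taking the k nearest neighbours as the candidate region, testing cleanliness with the ball whose diameter is the segment between a minority sample and its neighbour, and limiting interpolation to the near half of the segment when the neighbour belongs to the majority class.

```python
import numpy as np

def orem_like_oversample(X, y, minority_label, n_new, k=5, seed=0):
    """Hypothetical, simplified sketch of clean-region interpolation."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == minority_label)
    majority = X[y != minority_label]

    # Step 1 and 2: for each minority sample, take its k nearest neighbours
    # as the candidate region (assumption), then keep the neighbours whose
    # ball (diameter = segment to the neighbour) holds no majority sample.
    clean_pairs = []
    for i in minority_idx:
        dist = np.linalg.norm(X - X[i], axis=1)
        neighbours = [j for j in np.argsort(dist) if j != i][:k]
        for j in neighbours:
            midpoint = (X[i] + X[j]) / 2.0
            radius = dist[j] / 2.0
            # Strictly inside the ball; the small factor keeps a majority
            # neighbour sitting exactly on the boundary from counting.
            inside = np.linalg.norm(majority - midpoint, axis=1) < radius * (1 - 1e-9)
            if not inside.any():
                clean_pairs.append((i, j))

    if not clean_pairs:
        return np.empty((0, X.shape[1]))

    # Step 3: interpolate only along clean segments.
    synthetic = []
    for _ in range(n_new):
        i, j = clean_pairs[rng.integers(len(clean_pairs))]
        # Assumption: stay in the half nearest the minority sample when the
        # neighbour is a majority sample.
        max_gap = 1.0 if y[j] == minority_label else 0.5
        gap = rng.uniform(0.0, max_gap)
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.array(synthetic)
```

With this rule, a segment between two minority samples is skipped whenever a majority sample lies inside its ball, which is the behaviour that separates it from plain nearest-neighbour interpolation. The multiclass iteration of OREM-M and the boosting loop of OREMBoost are not sketched here, since the summary gives no procedural detail for them.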