FCM-CSMOTE: Fuzzy C-Means Center-SMOTE

Bibliographic Details
Published in: Expert Systems with Applications, 2024-08, Vol. 248, Article 123406
Main Authors: Mohammed, Roudani; Karim, El Moutaouakil
Format: Article
Language: English
Description
Summary: Imbalanced class distributions, in which the minority class is under-represented, pose a substantial challenge in machine learning. The Synthetic Minority Over-sampling Technique (SMOTE) is widely used to address this issue by generating synthetic minority samples through interpolation. Despite its popularity, SMOTE has drawbacks that stem from interpolating between randomly chosen samples. In this paper, we introduce a new data-level oversampling technique, Fuzzy C-Means Center-SMOTE (FCM-CSMOTE), which generates synthetic samples within each cluster using the cluster center, regarded as the memory of the main data components. We demonstrate that the proposed selective strategy has a very low probability of generating noise. Experimental results show that the proposed method outperforms state-of-the-art approaches on 21 real imbalanced data sets (both regular and large-sized) in terms of several metrics, including Geometric Mean (GM), F-Measure (FM), Area Under the Curve (AUC), and Accuracy.
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2024.123406
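
Illustration: The summary says only that synthetic samples are generated in each cluster using its center; it does not give the exact generation rule. The Python sketch below is therefore one possible reading, not the authors' implementation. It assumes that the minority class is clustered with a plain fuzzy c-means, that each sample is assigned to its highest-membership cluster, and that a synthetic point is drawn on the segment between a minority sample and that cluster's center. The function names `fuzzy_c_means` and `center_oversample`, the fuzzifier `m=2.0`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    # Start from a random membership matrix whose rows sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centers so they match the final memberships.
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U

def center_oversample(X_min, n_new, n_clusters=2, seed=0):
    """Assumed rule: interpolate each picked sample toward its cluster center."""
    rng = np.random.default_rng(seed)
    centers, U = fuzzy_c_means(X_min, n_clusters, seed=seed)
    labels = U.argmax(axis=1)                 # hard assignment by max membership
    idx = rng.integers(0, len(X_min), size=n_new)
    gap = rng.random((n_new, 1))              # interpolation factor in [0, 1)
    base = X_min[idx]
    target = centers[labels[idx]]
    return base + gap * (target - base)

# Toy minority class: two small blobs in 2-D (made-up data).
rng = np.random.default_rng(42)
X_min = np.vstack([rng.normal(0.0, 0.3, (15, 2)),
                   rng.normal(3.0, 0.3, (15, 2))])
synth = center_oversample(X_min, n_new=40, n_clusters=2)
print(synth.shape)  # (40, 2)
```

Because each center is a positively weighted mean of minority samples, every synthetic point in this sketch stays inside the convex hull of the minority data. This is one intuitive reason a center-based rule might generate less noise than interpolation between arbitrary neighbors, though the paper's own argument may differ.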