FCM-CSMOTE: Fuzzy C-Means Center-SMOTE
Published in: | Expert Systems with Applications, 2024-08, Vol. 248, p. 123406, Article 123406 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Imbalanced class distributions, in which the minority class is under-represented, pose a substantial challenge in machine learning. The Synthetic Minority Over-sampling Technique (SMOTE) has been widely employed to address this issue by generating synthetic minority samples through interpolation. Despite its popularity, SMOTE exhibits certain drawbacks caused by its random interpolation between samples. In this paper, we introduce a new data-level oversampling technique, called Fuzzy C-Means Center-SMOTE (FCM-CSMOTE), which generates synthetic samples in each cluster using the cluster center, treated as the memory of the main data components. We demonstrate that the proposed selective strategy has a very low probability of generating noise. The experimental results show that the proposed method performs better than state-of-the-art approaches on 21 real imbalanced data sets (both regular and large) in terms of several metrics, including Geometric Mean (GM), F-Measure (FM), Area Under the Curve (AUC), and Accuracy. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2024.123406 |
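
The summary describes the general idea (cluster the data with fuzzy c-means, then generate synthetic samples in each cluster using its center) but gives no algorithmic details. The following is only a minimal sketch of that idea under stated assumptions, not the authors' implementation: it assumes clustering is run on the minority samples alone, that each sample is assigned to its highest-membership cluster, and that a synthetic point is drawn uniformly on the segment between a minority sample and its cluster center. The function names `fuzzy_c_means` and `center_smote` and all parameters are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, rng=None):
    """Standard fuzzy c-means. Returns (centers, U), U of shape (n, n_clusters)."""
    rng = np.random.default_rng(rng)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are membership-weighted means of the samples.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distances sample-to-center, clipped to avoid division by zero.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def center_smote(X_min, n_new, n_clusters=3, rng=None):
    """Assumed center-based interpolation: x_new = x + gap * (center - x)."""
    rng = np.random.default_rng(rng)
    centers, U = fuzzy_c_means(X_min, n_clusters, rng=rng)
    labels = U.argmax(axis=1)                 # hard assignment (assumption)
    idx = rng.integers(0, X_min.shape[0], size=n_new)
    gaps = rng.random((n_new, 1))             # uniform in [0, 1)
    base = X_min[idx]
    return base + gaps * (centers[labels[idx]] - base)
```

Because each center is a weighted average of minority samples and each synthetic point is a convex combination of a sample and its center, the generated points stay inside the convex hull of the minority class in this sketch. Whether the published method uses the same interpolation rule or cluster assignment cannot be determined from the summary.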