An empirical study on the joint impact of feature selection and data resampling on imbalance classification
Published in: | Applied intelligence (Dordrecht, Netherlands), 2023-03, Vol.53 (5), p.5449-5461 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Many real-world datasets exhibit imbalanced distributions, in which the majority classes have sufficient samples, whereas the minority classes often have a very small number of samples. Data resampling has proven to be effective in alleviating such imbalanced settings, while feature selection is a commonly used technique for improving classification performance. However, the joint impact of feature selection and data resampling on two-class imbalance classification has rarely been addressed before. This work investigates the performance of two opposite imbalanced classification frameworks in which feature selection is applied before or after data resampling. We conduct a large-scale empirical study with a total of 9225 experiments on 52 publicly available datasets. The results show that both frameworks should be considered for finding the best performing imbalanced classification model. We also study the impact of classifiers, the ratio between the number of majority and minority samples (IR), and the ratio between the number of samples and features (SFR) on the performance of imbalance classification. Overall, this work provides a new reference value for researchers and practitioners in imbalance learning. |
---|---|
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-022-03772-1 |
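
The summary contrasts two orderings (feature selection before resampling, and resampling before feature selection) and defines two dataset ratios, IR and SFR. The following is a minimal Python sketch of those ideas, not the authors' experimental code. The record does not name the resampling or feature selection methods used in the study, so random oversampling and a simple class-mean-difference score are assumptions chosen only for illustration, and every function name and the toy dataset are hypothetical.

```python
import random
from statistics import mean, pstdev


def imbalance_ratio(y):
    """IR: number of majority samples divided by number of minority samples."""
    n_pos = sum(1 for label in y if label == 1)
    n_neg = len(y) - n_pos
    return max(n_pos, n_neg) / min(n_pos, n_neg)


def sample_feature_ratio(X):
    """SFR: number of samples divided by number of features."""
    return len(X) / len(X[0])


def random_oversample(X, y, seed=0):
    """Assumed resampler: duplicate random minority rows until classes match."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def select_top_k(X, y, k):
    """Assumed selector: keep the k features whose class means differ most."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        m1 = mean(v for v, label in zip(col, y) if label == 1)
        m0 = mean(v for v, label in zip(col, y) if label == 0)
        spread = pstdev(col) or 1.0  # avoid dividing by zero on constant columns
        scores.append(abs(m1 - m0) / spread)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(X, cols):
    """Keep only the selected feature columns."""
    return [[row[j] for j in cols] for row in X]


def fs_then_resample(X, y, k, seed=0):
    """Framework A: select features on the original data, then resample."""
    cols = select_top_k(X, y, k)
    X_res, y_res = random_oversample(project(X, cols), y, seed)
    return X_res, y_res, cols


def resample_then_fs(X, y, k, seed=0):
    """Framework B: resample first, then select features on the balanced data."""
    X_res, y_res = random_oversample(X, y, seed)
    cols = select_top_k(X_res, y_res, k)
    return project(X_res, cols), y_res, cols


# Hypothetical toy data: 10 minority and 90 majority samples, 5 features,
# where only feature 0 carries class signal.
rng = random.Random(42)
y = [1] * 10 + [0] * 90
X = [[rng.gauss(2.0 * label if j == 0 else 0.0, 1.0) for j in range(5)]
     for label in y]

print("IR:", imbalance_ratio(y), "SFR:", sample_feature_ratio(X))
for name, framework in [("FS -> resample", fs_then_resample),
                        ("resample -> FS", resample_then_fs)]:
    X_out, y_out, cols = framework(X, y, k=2)
    print(name, "selected features:", cols, "samples:", len(X_out))
```

The two frameworks can select different features because the selector sees different class proportions in each case. In practice, both steps would be fitted on the training split only, and each resulting dataset would then be passed to a classifier for comparison.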