Fuzzy-Based Information Decomposition for Incomplete and Imbalanced Data Learning
Published in: | IEEE transactions on fuzzy systems 2017-12, Vol.25 (6), p.1476-1490 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Summary: | Class imbalance and missing values are two critical problems in pattern classification. Researchers have proposed a number of techniques to address each problem individually. However, no single technique solves both problems, and simply combining existing approaches cannot accurately classify imbalanced data with missing values. This paper develops a fuzzy-based information decomposition (FID) method to address the two problems simultaneously. In the FID method, the two different problems are treated as the same missing data estimation problem. In particular, FID rebalances the training data by creating synthetic samples for the minority class. The proposed scheme has two steps: weighting and recovery. In the weighting step, the weights produced by fuzzy membership functions quantify the contribution of each observed value to the estimation of the missing values. In the recovery step, the missing values are estimated by taking into account the different contributions of the observed data. To evaluate the performance of the FID method, a large number of classification experiments were carried out on 27 well-known datasets. The results show that the FID method significantly outperforms ten other state-of-the-art individual methods and eight combination methods when missing values and class imbalance are present at the same time. |
---|---|
ISSN: | 1063-6706, 1941-0034 |
DOI: | 10.1109/TFUZZ.2017.2754998 |
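
The two steps named in the summary, weighting and recovery, can be illustrated with a minimal sketch. It is not the paper's formulation: the record does not give the membership functions or the recovery formula, so the triangular membership, the equal-width interval centers, the mean fallback, and the names `triangular_membership` and `fuzzy_weighted_estimates` are all assumptions made for illustration.

```python
import numpy as np


def triangular_membership(x, center, width):
    """Assumed triangular fuzzy membership: 1 at `center`, 0 beyond `width`."""
    return np.clip(1.0 - np.abs(x - center) / width, 0.0, None)


def fuzzy_weighted_estimates(column, n_estimates):
    """Estimate `n_estimates` values for one feature from its observed entries.

    Hypothetical sketch of a weighting step followed by a recovery step;
    the interval scheme and fallback are assumptions, not the paper's method.
    """
    column = np.asarray(column, dtype=float)
    observed = column[~np.isnan(column)]
    if observed.size == 0:
        raise ValueError("need at least one observed value")
    if n_estimates <= 0:
        return np.empty(0)

    lo, hi = observed.min(), observed.max()
    if hi == lo:
        # All observed values are identical, so every estimate is that value.
        return np.full(n_estimates, lo)

    # Split the observed range into equal-width intervals, one per estimate.
    width = (hi - lo) / n_estimates
    centers = lo + width * (np.arange(n_estimates) + 0.5)

    estimates = np.empty(n_estimates)
    for s, center in enumerate(centers):
        # Weighting: membership of each observed value in this interval.
        mu = triangular_membership(observed, center, width)
        total = mu.sum()
        if total > 0:
            # Recovery: membership-weighted average of the observed values.
            estimates[s] = np.dot(mu / total, observed)
        else:
            # No observed value contributes: fall back to the observed mean.
            estimates[s] = observed.mean()
    return estimates


if __name__ == "__main__":
    feature = [1.0, np.nan, 2.0, 3.0, np.nan, 10.0]
    print(fuzzy_weighted_estimates(feature, n_estimates=2))
```

Under these assumptions, the same routine could serve both purposes described in the summary: called on a feature with missing entries it produces imputed values, and called on the minority-class rows of each feature it produces values for synthetic samples.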