Fuzzy-Based Information Decomposition for Incomplete and Imbalanced Data Learning

Bibliographic Details
Published in: IEEE Transactions on Fuzzy Systems, 2017-12, Vol. 25 (6), p. 1476-1490
Main Authors: Liu, Shigang; Zhang, Jun; Xiang, Yang; Zhou, Wanlei
Format: Article
Language: English
Description
Summary: Class imbalance and missing values are two critical problems in pattern classification. Researchers have proposed a number of techniques to address each problem separately. However, no single technique solves both. Moreover, simply combining existing techniques does not accurately classify imbalanced data with missing values. This paper develops a fuzzy-based information decomposition (FID) method to address the two problems simultaneously. In the FID method, both problems are treated as the same missing-data estimation problem. In particular, FID rebalances the training data by creating synthetic samples for the minority class. The proposed scheme has two steps: weighting and recovery. In the weighting step, the weights produced by fuzzy membership functions quantify the contribution of each observed value to the estimate of a missing value. In the recovery step, missing values are estimated by taking into account the different contributions of the observed data. To evaluate the performance of the FID method, a large number of classification experiments were carried out on 27 well-known datasets. The results show that the FID method significantly outperforms ten other state-of-the-art individual methods and eight combination methods when missing values and imbalanced data are present at the same time.
ISSN: 1063-6706, 1941-0034
DOI: 10.1109/TFUZZ.2017.2754998
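
The weighting and recovery steps described in the summary can be illustrated with a small sketch. This is not the paper's FID algorithm, since the summary does not give its formulas. It assumes a triangular membership function, evenly spaced interval centres over the observed range, and a fallback to the observed mean when no observed value contributes. The names `triangular_membership` and `estimate_values` are hypothetical.

```python
from statistics import mean


def triangular_membership(x, center, width):
    """Membership of x in a fuzzy set centred at `center`.

    The triangular shape is an assumption of this sketch, not taken
    from the paper.
    """
    if width <= 0:
        # Degenerate case: all observed values are identical.
        return 1.0 if x == center else 0.0
    return max(0.0, 1.0 - abs(x - center) / width)


def estimate_values(observed, n_estimates):
    """Estimate `n_estimates` values of one feature from its observed values.

    Weighting step: every observed value receives a fuzzy weight for
    each interval centre.
    Recovery step: each estimate is the weighted average of the observed
    values, normalised by the total weight.
    """
    if not observed:
        raise ValueError("need at least one observed value")
    if n_estimates < 1:
        raise ValueError("n_estimates must be at least 1")

    lo, hi = min(observed), max(observed)
    step = (hi - lo) / n_estimates  # assumed: equal-width intervals

    estimates = []
    for s in range(n_estimates):
        center = lo + (s + 0.5) * step
        # Weighting step
        weights = [triangular_membership(x, center, step) for x in observed]
        total = sum(weights)
        # Recovery step
        if total == 0:
            # Assumed fallback: no observed value is close enough to contribute.
            estimates.append(mean(observed))
        else:
            weighted_sum = sum(w * x for w, x in zip(weights, observed))
            estimates.append(weighted_sum / total)
    return estimates


if __name__ == "__main__":
    minority_feature = [1.0, 2.0, 3.0, 4.0]
    print(estimate_values(minority_feature, 2))
```

In this sketch the same routine could fill a missing entry or produce a feature value for a synthetic minority-class sample, which mirrors the summary's point that both problems are treated as one estimation problem.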