
Boosting imbalanced data learning with Wiener process oversampling

Bibliographic Details
Published in: Frontiers of Computer Science, 2017-10, Vol. 11 (5), pp. 836-851
Main Authors: LI, Qian, LI, Gang, NIU, Wenjia, CAO, Yanan, CHANG, Liang, TAN, Jianlong, GUO, Li
Format: Article
Language:English
Description
Summary: Learning from imbalanced data is a challenging task in a wide range of applications and attracts significant research effort from the machine learning and data mining communities. Oversampling is a natural approach to this issue: it balances the training set by replicating existing minority-class samples or by synthesizing new ones. In general, synthesis outperforms replication because it supplies additional information about the minority class. However, the synthesized samples are typically required to follow the normal distribution of the training set, which further confines them to the predefined attribute ranges of the training data. This paper presents the Wiener process oversampling (WPO) technique, which brings a physical phenomenon, Brownian motion, into sample synthesis. WPO constructs a robust decision region by expanding the attribute ranges of the training set while preserving its normal distribution, and it achieves satisfactory performance at much lower computational complexity. Moreover, by integrating WPO with ensemble learning, the resulting WPOBoost algorithm outperforms many prevalent imbalanced-learning solutions.
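The abstract's core idea, synthesizing minority samples via Gaussian Wiener-process increments so that attribute ranges can expand slightly beyond the original data while the perturbations stay normally distributed, can be sketched as follows. This is a minimal illustration based only on the abstract, not the paper's actual algorithm; the function name, step count, and volatility parameter `sigma` are assumptions.

```python
import numpy as np

def wiener_oversample(X_min, n_new, n_steps=10, sigma=0.1, seed=0):
    """Sketch of Wiener-process-style oversampling (hypothetical).

    Each synthetic sample starts at a randomly chosen minority sample
    and is displaced by the endpoint of a discretized Wiener process
    (a sum of independent Gaussian increments). Because the increments
    are Gaussian, synthetic points remain normally distributed around
    the originals, yet can fall slightly outside the original attribute
    ranges, expanding the decision region for the minority class.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pick a seed sample for each new point
    idx = rng.integers(0, len(X_min), size=n_new)
    # n_steps Gaussian increments per path; total displacement has std sigma
    steps = rng.normal(0.0, sigma / np.sqrt(n_steps),
                       size=(n_new, n_steps, X_min.shape[1]))
    displacement = steps.sum(axis=1)  # Wiener-path endpoint per sample
    return X_min[idx] + displacement

# Toy minority class: 5 points in 2-D
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
X_new = wiener_oversample(X_min, n_new=20)
print(X_new.shape)  # (20, 2)
```

The balanced set `X_min` plus `X_new` could then feed a boosting loop, which is roughly how the abstract describes WPOBoost combining WPO with ensemble learning.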
ISSN:2095-2228
2095-2236
DOI:10.1007/s11704-016-5250-y