Minority oversampling for imbalanced ordinal regression

Bibliographic Details
Published in: Knowledge-based systems, 2019-02, Vol. 166, p. 140-155
Main Authors: Zhu, Tuanfei; Lin, Yaping; Liu, Yonghe; Zhang, Wei; Zhang, Jianming
Format: Article
Language: English
Description
Summary: Ordinal regression problems are naturally class-imbalanced, because samples of the boundary classes tend to occur with lower probability than samples of the other classes. Traditional oversampling algorithms, the most common remedy for class imbalance, can improve the classification of minority classes, but they also cause over-generalization, in which synthetic samples are created in incorrect regions. In ordinal regression, over-generalization can damage the ordering of the samples' spatial distribution, thereby preventing ordinal regression models from benefiting from ordering information. In this paper, we propose a generation direction-aware Synthetic Minority oversampling technique designed specifically for imbalanced Ordinal Regression (SMOR). For each candidate generation direction, SMOR computes a selection weight that governs how likely that direction is to be used to generate synthetic samples. Because the ordering of the classes is taken into account, candidate generation directions that may distort the ordinal sample structure tend to be assigned low selection weights. In this way, SMOR improves the ordering of minority classes without severely damaging the existing ordering of the problem. Extensive experiments with three ordinal regression classifiers show that the proposed method outperforms typical existing oversampling algorithms in terms of the Average of Mean Absolute Error (AMAE) and the Maximum Mean Absolute Error (MMAE).

• Identifies a clear risk of over-generalization when oversampling imbalanced ordinal regression data.
• Proposes a generation direction-aware synthetic minority oversampling algorithm.
• Establishes a mechanism for protecting the ordinal sample structure.
• Achieves better performance than the state of the art across various ordinal regression classifiers.
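
The two evaluation measures named in the summary, AMAE and MMAE, are commonly defined as the mean and the maximum of the per-class mean absolute errors. The following is a minimal sketch under that usual definition, not code from the paper. It assumes ordinal labels are coded as consecutive integers, and the function names `per_class_mae`, `amae`, and `mmae`, as well as the toy labels, are illustrative choices rather than anything given in the record.

```python
from collections import defaultdict


def per_class_mae(y_true, y_pred):
    """Mean absolute error computed separately for each true class.

    Assumes labels are integers whose differences reflect ordinal distance.
    """
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    return {c: sum(e) / len(e) for c, e in errors.items()}


def amae(y_true, y_pred):
    """Average of the per-class MAEs: every class counts equally."""
    maes = per_class_mae(y_true, y_pred)
    return sum(maes.values()) / len(maes)


def mmae(y_true, y_pred):
    """Largest per-class MAE: the error on the worst-served class."""
    return max(per_class_mae(y_true, y_pred).values())


# Toy data (made up for illustration): the boundary classes 1 and 4
# are rare and both misclassified by one rank.
y_true = [1, 2, 2, 2, 2, 3, 3, 3, 3, 4]
y_pred = [2, 2, 2, 2, 2, 3, 3, 3, 3, 3]

plain_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_mae, amae(y_true, y_pred), mmae(y_true, y_pred))
```

On this toy data the plain MAE is dominated by the well-classified majority classes, while AMAE and MMAE expose the errors on the rare boundary classes, which is why such measures suit imbalanced ordinal problems.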
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2018.12.021