Minority oversampling for imbalanced ordinal regression
Published in: | Knowledge-based systems 2019-02, Vol.166, p.140-155 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Ordinal regression naturally exhibits class imbalance, because samples of the boundary classes tend to occur with lower probability than those of the other classes. Traditional oversampling algorithms, the most common solution to class imbalance problems, can improve the classification of minority classes, but they also cause over-generalization, in which synthetic samples are created in incorrect regions. In the context of ordinal regression, over-generalization can damage the ordering of the spatial distribution of samples, thereby preventing ordinal regression models from benefiting from ordering information. In this paper, we propose a generation direction-aware Synthetic Minority oversampling technique designed specifically for imbalanced Ordinal Regression (SMOR). For each candidate generation direction, SMOR computes a selection weight for its use in generating synthetic samples. Because the ordering of the classes is taken into account, candidate generation directions that may distort the ordinal sample structure tend to be assigned low selection weights. In this way, SMOR improves the ordering of minority classes without severely damaging the existing ordering of the problem. Extensive experiments with three ordinal regression classifiers show that the proposed method outperforms existing typical oversampling algorithms in terms of the Average of Mean Absolute Error (AMAE) and the Maximum Mean Absolute Error (MMAE).
• Identifies a clear risk of over-generalization when oversampling imbalanced ordinal regression data.
• Proposes a generation direction-aware synthetic minority oversampling algorithm.
• Establishes a mechanism for protecting the ordinal sample structure.
• Outperforms the state of the art across various ordinal regression classifiers. |
---|---|
ISSN: | 0950-7051 1872-7409 |
DOI: | 10.1016/j.knosys.2018.12.021 |
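
The two evaluation measures named in the summary, AMAE and MMAE, are commonly defined as the mean and the maximum of the per-class mean absolute errors, so that every class counts equally regardless of its size. The following is a minimal sketch under that common definition and is not taken from the article itself; the helper names `per_class_mae`, `amae`, and `mmae` are hypothetical, and the class labels are assumed to be integers encoding the ordinal ranks.

```python
from collections import defaultdict


def per_class_mae(y_true, y_pred):
    """Return {class label: mean absolute error over the samples of that class}.

    Assumes labels are integers that encode the ordinal ranks.
    """
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    return {label: sum(e) / len(e) for label, e in errors.items()}


def amae(y_true, y_pred):
    """Average of the per-class MAEs (each class weighted equally)."""
    maes = per_class_mae(y_true, y_pred)
    return sum(maes.values()) / len(maes)


def mmae(y_true, y_pred):
    """Largest per-class MAE, i.e. the error on the worst-served class."""
    return max(per_class_mae(y_true, y_pred).values())


# Toy data (made up for illustration): boundary classes 1 and 3 are minorities,
# and the classifier always predicts the majority class 2.
y_true = [1, 2, 2, 2, 2, 3]
y_pred = [2, 2, 2, 2, 2, 2]

plain_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_mae, amae(y_true, y_pred), mmae(y_true, y_pred))
```

In this toy case the plain MAE is 1/3, while AMAE is 2/3 and MMAE is 1, which shows why per-class measures are preferred when the boundary classes are under-represented.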