Dynamic Synthetic Minority Over-Sampling Technique-Based Rotation Forest for the Classification of Imbalanced Hyperspectral Data

Bibliographic Details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019-07, Vol. 12 (7), p. 2159-2169
Main Authors: Feng, Wei; Dauphin, Gabriel; Huang, Wenjiang; Quan, Yinghui; Bao, Wenxing; Wu, Mingquan; Li, Qiang
Format: Article
Language: English
Description
Summary: Rotation forest (RoF) is a powerful ensemble classifier that has attracted substantial attention for its performance in hyperspectral data classification. Multi-class imbalance learning is one of the biggest challenges in machine learning and remote sensing. The standard technique for constructing an RoF ensemble tends to increase overall accuracy, but RoF has difficulty recognizing the minority class sufficiently well. This paper proposes a novel dynamic SMOTE (synthetic minority over-sampling technique)-based RoF algorithm for the multi-class imbalance problem. The main idea of the proposed method is to dynamically balance the class distribution before building each rotation decision tree. A resampling rate is set in each iteration (ranging from 10% in the first iteration to 100% in the last), and this rate defines the number of minority class instances randomly resampled (with replacement) from the original dataset in that iteration. The remaining minority class instances are generated by the SMOTE method. The reported results on three real hyperspectral datasets show that the proposed method can achieve better performance than random forest, RoF, and some popular data sampling methods.
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2019.2922297
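
The per-iteration balancing step described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions the summary does not state: the rate schedule is taken to be linear between 10% and 100%, every minority class is brought up to the size of the largest class, the rate is applied to that target size, and the synthetic points come from a basic SMOTE (interpolation between a minority sample and one of its k nearest same-class neighbours, with k = 5 by default). The function names `resampling_rates`, `smote_samples`, and `dynamic_smote_balance` are hypothetical.

```python
import numpy as np


def resampling_rates(n_trees, start=0.1, stop=1.0):
    """Rate for each iteration. A linear schedule is an assumption:
    the summary only gives the endpoints (10% first, 100% last)."""
    return np.linspace(start, stop, n_trees)


def smote_samples(X_min, n_new, k, rng):
    """Basic SMOTE: interpolate between a minority sample and one of its
    k nearest neighbours from the same class."""
    n, d = X_min.shape
    if n_new == 0:
        return np.empty((0, d))
    if n < 2:
        # A single sample has no neighbour to interpolate with; duplicate it.
        return np.repeat(X_min, n_new, axis=0)
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances within the class.
    dist = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])


def dynamic_smote_balance(X, y, rate, k=5, rng=None):
    """Return a balanced copy of (X, y) for one iteration.

    Assumption: each minority class is grown to the size of the largest
    class; a fraction `rate` of that size is drawn with replacement from
    the original minority samples and the rest is SMOTE-generated.
    """
    rng = np.random.default_rng() if rng is None else rng
    classes, counts = np.unique(y, return_counts=True)
    target = int(counts.max())
    X_parts, y_parts = [], []
    for c, n_c in zip(classes, counts):
        X_c = X[y == c]
        if n_c == target:
            # Largest class (or a tie): kept unchanged.
            X_parts.append(X_c)
        else:
            n_boot = min(max(int(round(float(rate) * target)), 0), target)
            boot = X_c[rng.integers(0, n_c, size=n_boot)]
            synth = smote_samples(X_c, target - n_boot, k, rng)
            X_parts.append(np.vstack([boot, synth]))
        y_parts.append(np.full(target, c))
    return np.vstack(X_parts), np.concatenate(y_parts)
```

In an ensemble loop, one would call `dynamic_smote_balance(X, y, rate, rng=rng)` once per value in `resampling_rates(n_trees)` and fit one rotation decision tree on each balanced set. Under these assumptions, early trees see mostly synthetic minority samples, while the last tree sees only resampled originals.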