Rotation Forest: A New Classifier Ensemble Method

Bibliographic Details
Published in:	IEEE transactions on pattern analysis and machine intelligence 2006-10, Vol.28 (10), p.1619-1630
Main Authors:	Rodriguez, J.J., Kuncheva, L.I., Alonso, C.J.
Format:	Article
Language:	English
Subjects:	Accuracy; AdaBoost; Algorithms; Applied sciences; Artificial Intelligence; Bagging; Classification tree analysis; Classifier ensembles; Classifiers; Cluster Analysis; Computer science control theory systems; Computer Simulation; Computer Society; Decision trees; Exact sciences and technology; Feature extraction; Forests; Information Storage and Retrieval - methods; kappa-error diagrams; Machine learning; Mathematical models; Models, Statistical; Numerical Analysis, Computer-Assisted; Pattern recognition; Pattern Recognition, Automated - methods; Pattern recognition. Digital image processing. Computational geometry; PCA; Preserves; Principal Component Analysis; Principal components analysis; random forest; Reproducibility of Results; Sensitivity and Specificity; Studies; Training data; Voting

Description
Summary:	We propose a method for generating classifier ensembles based on feature extraction. To create the training data for a base classifier, the feature set is randomly split into K subsets (K is a parameter of the algorithm) and principal component analysis (PCA) is applied to each subset. All principal components are retained in order to preserve the variability information in the data. Thus, K axis rotations take place to form the new features for a base classifier. The idea of the rotation approach is to encourage individual accuracy and diversity within the ensemble simultaneously. Diversity is promoted through the feature extraction for each base classifier. Decision trees were chosen here because they are sensitive to rotation of the feature axes, hence the name "forest". Accuracy is sought by keeping all principal components and also using the whole data set to train each base classifier. Using WEKA, we examined the rotation forest ensemble on a random selection of 33 benchmark data sets from the UCI repository and compared it with bagging, AdaBoost, and random forest. The results were favorable to rotation forest and prompted an investigation into the diversity-accuracy landscape of the ensemble models. Diversity-error diagrams revealed that rotation forest ensembles construct individual classifiers that are more accurate than those in AdaBoost and random forest, and more diverse than those in bagging, sometimes more accurate as well.
ISSN:	0162-8828, 1939-3539, 2160-9292
DOI:	10.1109/TPAMI.2006.211
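
Illustration:	The rotation step described in the summary can be sketched as follows. This is a minimal sketch that assumes only what the summary states: a random split of the features into K subsets, PCA on each subset, and all components retained. It is not the authors' WEKA implementation, and any further details given in the full article are not reproduced. The function name rotation_matrix and its parameters are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def rotation_matrix(X, K, rng):
    """Build one orthogonal feature transform for a single base classifier.

    Hypothetical helper: X is an (n_samples, n_features) array, K is the
    number of feature subsets, rng is a numpy.random.Generator.
    """
    n_features = X.shape[1]
    if not 1 <= K <= n_features:
        raise ValueError("K must be between 1 and the number of features")

    # Randomly split the feature indices into K disjoint subsets.
    subsets = np.array_split(rng.permutation(n_features), K)

    R = np.zeros((n_features, n_features))
    for idx in subsets:
        # Center the columns of this subset before PCA.
        Xs = X[:, idx] - X[:, idx].mean(axis=0)
        # PCA via eigendecomposition of the subset covariance matrix.
        cov = np.atleast_2d(np.cov(Xs, rowvar=False))
        _, vecs = np.linalg.eigh(cov)
        # Keep every component, so each block is a full orthogonal matrix.
        R[np.ix_(idx, idx)] = vecs
    return R


# Usage: transform the whole data set for one ensemble member.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
R = rotation_matrix(X, K=3, rng=rng)
X_rot = X @ R  # new features on which a decision tree would be trained
```

Because every component is kept, the transform is orthogonal and changes only the orientation of the feature axes, not the distances between samples. An ensemble would repeat the call with a fresh random split for each base classifier, so that each decision tree sees differently oriented axes.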