Ensemble feature selection: Homogeneous and heterogeneous approaches
Published in: Knowledge-Based Systems, 2017-02, Vol. 118, pp. 124–139
Main Authors: , , ,
Format: Article
Language: English
Summary:
• In recent years, ensemble learning has been the focus of much attention.
• We apply two different designs of ensemble learning to the feature selection process.
• The homogeneous ensemble distributes the dataset over different nodes (see the sketch after this list).
• The heterogeneous ensemble combines the results of different feature selection methods.
• We reduce training time and relieve the user of the need to choose a single feature selection method.
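As a rough illustration of the homogeneous design mentioned in the highlights, the sketch below partitions the training data across simulated "nodes", runs the same ranker on each partition, and merges the partial rankings by mean rank position. The choice of mutual information from scikit-learn and the mean-rank combination are illustrative assumptions, not necessarily the base selector or aggregator used in the article.

```python
# Minimal sketch of a homogeneous ensemble: one ranker, several data partitions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

def rank_features(X, y):
    """Return feature indices ordered from most to least relevant."""
    scores = mutual_info_classif(X, y, random_state=0)
    return np.argsort(scores)[::-1]                 # best feature first

def homogeneous_ensemble(X, y, n_nodes=5):
    """Split the data across 'nodes', rank on each split, combine by mean rank."""
    n_features = X.shape[1]
    splits = np.array_split(np.random.permutation(len(y)), n_nodes)
    positions = np.zeros((n_nodes, n_features))
    for node, idx in enumerate(splits):
        order = rank_features(X[idx], y[idx])
        positions[node, order] = np.arange(n_features)  # rank position of each feature
    return np.argsort(positions.mean(axis=0))       # combined ranking, best first

X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
print(homogeneous_ensemble(X, y)[:5])               # top-5 features by combined rank
```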
In the last decade, ensemble learning has become a prolific discipline in pattern recognition, based on the assumption that combining the outputs of several models yields better results than the output of any individual model. On the basis that the same principle can be applied to feature selection, we describe two approaches: (i) homogeneous, i.e., using the same feature selection method with different training data and distributing the dataset over several nodes; and (ii) heterogeneous, i.e., using different feature selection methods with the same training data. Both approaches are based on combining feature rankings that contain all the ordered features. The results of the base selectors are combined using different combination methods, also called aggregators, and a practical subset is selected according to several different threshold values (traditional values based on fixed percentages, and more novel automatic methods based on data complexity measures). In testing with a Support Vector Machine as the classifier, ensemble results on seven datasets show performance that is at least comparable to, and often better than, that of the individual feature selection methods.
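The heterogeneous design described in the abstract can be sketched analogously: several different rankers score the same training data, an aggregator merges their orderings, and a threshold cuts the combined ranking before classification with an SVM. The specific rankers (chi-squared, ANOVA F-score, mutual information), the mean-rank aggregator, and the fixed 25% threshold below are assumptions chosen for illustration, not necessarily the combination evaluated in the article.

```python
# Minimal sketch of a heterogeneous ensemble: several rankers, same data,
# mean-rank aggregation, fixed-percentage threshold, SVM evaluation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X = MinMaxScaler().fit_transform(X)                 # chi2 requires non-negative inputs

rankers = {
    "chi2": lambda X, y: chi2(X, y)[0],
    "anova": lambda X, y: f_classif(X, y)[0],
    "mutual_info": lambda X, y: mutual_info_classif(X, y, random_state=0),
}

n_features = X.shape[1]
positions = np.zeros((len(rankers), n_features))
for i, score_fn in enumerate(rankers.values()):
    order = np.argsort(score_fn(X, y))[::-1]        # best feature first
    positions[i, order] = np.arange(n_features)     # rank position per feature

combined = np.argsort(positions.mean(axis=0))       # mean-rank aggregation
top_k = int(0.25 * n_features)                      # illustrative 25% threshold
subset = combined[:top_k]

score = cross_val_score(SVC(kernel="linear"), X[:, subset], y, cv=5).mean()
print(f"Selected {top_k} features, 5-fold CV accuracy: {score:.3f}")
```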
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2016.11.017