Search space division method for wrapper feature selection on high-dimensional data classification
Published in: | Knowledge-based systems 2024-05, Vol.291, p.111578, Article 111578 |
---|---|
Main Author: | |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Feature selection (FS) is an essential pre-processing technique for high-dimensional data. Wrapper-based FS techniques are known for their superior performance over filter-based FS. However, when the dimensionality of the data is very high, wrapper techniques become computationally very expensive. To address this scalability problem, this paper proposes the concept of search space division (SSD), which leads to smaller search spaces and hence reduced computational cost. The proposed SSD approach is generic and can be integrated with any wrapper-based FS technique. SSD divides the search space into multiple parts, and the wrapper-based FS is applied independently to each part. To allow features to interact over the complete search space, the feature subsets obtained from all parts are combined and the wrapper-based FS is applied again to obtain the final feature subset. Moreover, a new wrapper FS technique named the Binary Rao (BRAO) algorithm is proposed. BRAO is based on Rao-1, a metaphor-less, parameter-less meta-heuristic optimization algorithm. The proposed combination of SSD and BRAO, named SSDR, produced better classification accuracies with fewer features in much less time than other well-regarded wrapper FS techniques and classical FS techniques on thirteen benchmark high-dimensional datasets. |
---|---|
ISSN: | 0950-7051 1872-7409 |
DOI: | 10.1016/j.knosys.2024.111578 |
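
The two-stage SSD procedure described in the summary can be illustrated with a minimal Python sketch. It is not the paper's implementation: the contiguous, equal-sized partition, the function names `ssd_select` and `greedy_forward`, and the toy fitness are all assumptions made for illustration. In particular, a simple greedy forward search stands in for BRAO, and the toy fitness stands in for a classifier's accuracy with a small penalty on subset size.

```python
from typing import Callable, List, Sequence

Fitness = Callable[[Sequence[int]], float]
Wrapper = Callable[[Sequence[int], Fitness], List[int]]


def greedy_forward(candidates: Sequence[int], fitness: Fitness) -> List[int]:
    """Stand-in wrapper (NOT BRAO): add the best feature while fitness improves."""
    selected: List[int] = []
    best = fitness(selected)
    improved = True
    while improved:
        improved = False
        pick = None
        for f in candidates:
            if f in selected:
                continue
            score = fitness(selected + [f])
            if score > best:  # keep the best single addition of this round
                best, pick, improved = score, f, True
        if improved:
            selected.append(pick)
    return sorted(selected)


def ssd_select(n_features: int, n_parts: int,
               wrapper: Wrapper, fitness: Fitness) -> List[int]:
    """Search space division: run the wrapper per part, pool, then run it again."""
    size = -(-n_features // n_parts)  # ceiling division
    # Assumption: contiguous, equal-sized parts; the summary does not say
    # how the search space is divided.
    parts = [list(range(start, min(start + size, n_features)))
             for start in range(0, n_features, size)]
    pooled: List[int] = []
    for part in parts:
        pooled.extend(wrapper(part, fitness))  # stage 1: independent searches
    return wrapper(pooled, fitness)            # stage 2: search the pooled subsets


# Hypothetical toy fitness: features 3, 17 and 29 are "informative";
# every selected feature costs 0.01 (mimicking a preference for small subsets).
INFORMATIVE = {3, 17, 29}


def toy_fitness(subset: Sequence[int]) -> float:
    return len(INFORMATIVE & set(subset)) - 0.01 * len(subset)


selected = ssd_select(n_features=40, n_parts=4,
                      wrapper=greedy_forward, fitness=toy_fitness)
print(selected)
```

In a realistic setting the fitness would typically be a cross-validated classification accuracy, and any wrapper with the same call shape could be passed in place of `greedy_forward`. Each stage-1 search sees only its own part of the features, which is where the reduction in cost would come from.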