Feature Selection with a Binary Flamingo Search Algorithm and a Genetic Algorithm
Published in: | Multimedia tools and applications 2023-07, Vol.82 (17), p.26679-26730 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In data mining, feature selection (FS) has become an important data pre-processing step that aims to maximise a model's generalisation while minimising the number of features. Because the search space is large, classical optimization techniques often fail to find the global optimum. Several hybrid models that combine different search strategies have recently been proposed; however, they mostly deal with low-dimensional datasets. This paper proposes a hybrid of binary flamingo search and a genetic algorithm (HBFS-GA) to address the FS problem using a wrapper model. The proposed method combines a genetic algorithm (GA) with a flamingo search algorithm (FSA). HBFS-GA searches a continuous space, whereas FS is a discrete problem, so transfer functions (TFs) are used to map the continuous search onto a discrete one. Eight distinct TFs are evaluated to determine the best TF and to investigate HBFS-GA. The performance of the proposed HBFS-GA is then evaluated on 18 different UCI datasets using several metrics. The best-performing variant is selected and compared with existing wrapper-based and filter-based FS models. The wrapper-based models include BPSO, BGA, BACO, BCS, BGWO, BBAT, BGEO, and BCSO. The filter-based methods include gain ratio, correlation-based FS, information gain, and Relief. In addition, the proposed HBFS-GA is evaluated on 30 functions from the CEC’2019 and CEC’2020 benchmarks. Overall, the proposed HBFS-GA achieves better results than the existing models. Across the eighteen datasets, the lung cancer dataset yields a classification accuracy of 99.51% with a computation time of 0.031 s. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-023-15467-x |
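
The summary states that transfer functions (TFs) map a continuous search onto the discrete feature-selection space. The sketch below illustrates the general idea using two TF families that are common in the binary metaheuristic literature: an S-shaped sigmoid and a V-shaped `|tanh|`. It is a minimal illustration under assumptions, not the paper's method: the record does not list the eight TFs that were tested, and the function names (`s_shaped`, `v_shaped`, `binarize_s`, `binarize_v`) and the clipping bound are hypothetical choices made for this example.

```python
import math
import random


def s_shaped(x):
    """S-shaped TF (sigmoid): maps a continuous value to a probability in [0, 1]."""
    x = max(-60.0, min(60.0, x))  # assumed clipping bound, avoids overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x):
    """V-shaped TF (|tanh|): large magnitudes of x give a high flip probability."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """With an S-shaped TF, the probability sets each bit directly (1 = feature kept)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, previous_bits, rng):
    """With a V-shaped TF, the probability decides whether to flip the previous bit."""
    return [
        1 - bit if rng.random() < v_shaped(x) else bit
        for x, bit in zip(position, previous_bits)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    continuous_position = [2.5, -1.0, 0.1, -3.0]  # hypothetical agent position
    mask = binarize_s(continuous_position, rng)
    print("S-shaped mask:", mask)
    print("V-shaped mask:", binarize_v(continuous_position, mask, rng))
```

In a wrapper setting, the resulting 0/1 mask would select the feature columns passed to a classifier, whose accuracy then feeds the fitness of the search agent.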