Feature Selection with a Binary Flamingo Search Algorithm and a Genetic Algorithm

Bibliographic Details
Published in: Multimedia Tools and Applications, 2023-07, Vol. 82 (17), p. 26679-26730
Main Authors: Eluri, Rama Krishna; Devarakonda, Nagaraju
Format: Article
Language: English
Description
Summary: In data mining, feature selection (FS) has become an important data pre-processing step that improves a model's generalisation while reducing the number of features. Because the search space is large, classical optimization techniques often fail to find the global optimum. Several hybrid models integrating various search strategies have recently been proposed; however, they mostly deal with low-dimensional datasets. This paper proposes a hybrid of binary flamingo search and a genetic algorithm (HBFS-GA) to address the FS problem using a wrapper model, combining a genetic algorithm (GA) with the flamingo search algorithm (FSA). HBFS-GA searches a continuous space, whereas FS is a discrete problem, so transfer functions (TFs) are used to map the continuous search into a discrete one. Eight distinct TFs are used to evaluate HBFS-GA and to determine the best TF. The performance of HBFS-GA is then evaluated on 18 UCI datasets using multiple metrics. The best-performing variant is selected and compared with existing wrapper-based and filter-based FS models. The wrapper-based models include BPSO, BGA, BACO, BCS, BGWO, BBAT, BGEO, and BCSO; the filter-based methods include gain ratio, correlation-based FS, information gain, and Relief. In addition, HBFS-GA is evaluated on 30 functions from the CEC’2019 and CEC’2020 benchmarks. Overall, HBFS-GA achieves better results than the existing models. In the classification-accuracy analysis over the eighteen datasets, the lung cancer dataset attains a high accuracy of 99.51% with a low computation time of 0.031 s.
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-023-15467-x
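
Illustration: The summary states that transfer functions map the continuous search into a discrete feature mask, but this record does not list the eight TFs the paper uses. The sketch below is therefore only an assumption-based illustration of the general technique, using two commonly seen forms: a sigmoid (S-shaped) TF, where the output is the probability of selecting a feature, and |tanh| (V-shaped), where the output is the probability of flipping the previous bit. The function names and the sample position vector are hypothetical and are not taken from the paper.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Sigmoid (S-shaped) transfer function, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x: float) -> float:
    """|tanh(x)| (V-shaped) transfer function."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """Set bit j to 1 with probability s_shaped(position[j])."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, previous_bits, rng):
    """Flip previous bit j with probability v_shaped(position[j])."""
    return [
        1 - bit if rng.random() < v_shaped(x) else bit
        for x, bit in zip(position, previous_bits)
    ]


# Hypothetical continuous position of one search agent (one value per feature).
rng = random.Random(0)
position = [-2.0, -0.5, 0.0, 0.5, 2.0]

mask = binarize_s(position, rng)          # 1 = feature kept, 0 = feature dropped
updated = binarize_v(position, mask, rng)  # V-shaped update of the same mask
print(mask, updated)
```

In a wrapper setting, a mask such as `mask` would select the columns passed to the classifier whose accuracy scores the candidate; that evaluation step is omitted here.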