Set-based integer-coded fuzzy granular evolutionary algorithms for high-dimensional feature selection

Bibliographic Details
Published in: Applied soft computing 2023-07, Vol. 142, p. 110240, Article 110240
Main Authors: Saadatmand, H., Akbarzadeh-T, M.-R.
Format: Article
Language: English
Description
Summary: Feature selection plays a pivotal role in handling today’s high-dimensional databases by keeping only the most valuable features, leading to less computation, improved performance, and higher transparency in decision-making processes. Despite the considerable advances in combinatorial optimization, this data-preprocessing step is computationally NP-hard and continues to pose critical challenges, particularly for very high-dimensional (VHD) databases. Here, we propose integer coding and fuzzy granulation (FG) as an integral part of evolutionary wrapper-based feature selection. Based on this integer coding, we further propose crossover and mutation operators that employ set operations such as ‘union,’ ‘intersection,’ and ‘complement’ for higher transparency in their evolutionary explorative and exploitative search processes. In addition to its common use as a surrogate technique to avoid unnecessary computations by recognizing similarities, the fuzzy granulation concept also operates as a repulsive strategy that searches for dissimilarities in the elitist and population initialization routines to reach higher population diversity. An ablation study is implemented to discover the role of individual components of this multi-prong approach. The results are then compared on 22 benchmark problems, ranging from 64 to 138672 attributes, with nine competing methods. The proposed approach shows superior accuracy (in 15 of 22 cases), a substantially smaller feature set (as much as six times smaller), and considerably lower computational cost (by an average of 30 percent), particularly for VHD feature selection.
• Very high-dimensional evolutionary feature selection uses integer coding.
• Union, intersection, and complement operators perform crossover and mutation.
• Fuzzy granulation acts as a diversity-preserving strategy in initialization and elitism.
• Fuzzy surrogate granulation is also employed to reduce fitness evaluations.
• Up to 30 percent fewer computations for problems with more than 1000 features.
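The summary does not spell out the operators, so the following is only a minimal Python sketch of what set-based crossover and mutation on integer-coded individuals might look like. It assumes each individual is a list of selected feature indices. The function names `set_crossover` and `set_mutation`, the 0.5 inheritance probability, and the one-in/one-out mutation rule are hypothetical illustrations, not the authors' published operators.

```python
import random


def set_crossover(parent_a, parent_b, rng):
    """Hypothetical crossover: keep the features both parents share
    (intersection) and inherit each remaining feature of the union
    with probability 0.5 (an assumed rate)."""
    a, b = set(parent_a), set(parent_b)
    common = a & b                      # features both parents agree on
    disputed = sorted((a | b) - common)  # features held by only one parent
    kept = {f for f in disputed if rng.random() < 0.5}
    return sorted(common | kept)


def set_mutation(individual, n_features, rng):
    """Hypothetical mutation: drop one selected feature and add one
    feature drawn from the complement of the selected set."""
    selected = set(individual)
    complement = set(range(n_features)) - selected
    child = set(selected)
    if len(selected) > 1:               # never empty the subset
        child.discard(rng.choice(sorted(selected)))
    if complement:                      # nothing to add if all are selected
        child.add(rng.choice(sorted(complement)))
    return sorted(child)


# Example usage with made-up parents over 30 features.
rng = random.Random(0)
parent_a = [2, 5, 9, 14]
parent_b = [5, 9, 21]
child = set_crossover(parent_a, parent_b, rng)
mutant = set_mutation(child, n_features=30, rng=rng)
print(child, mutant)
```

With integer coding, an individual stores only the indices of selected features instead of a bit per feature, so the set operations above work on short lists even when the total number of features is very large. The exception in this sketch is the complement, which is built explicitly here for clarity and would likely need a sampling-based shortcut at VHD scale.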
ISSN: 1568-4946, 1872-9681
DOI: 10.1016/j.asoc.2023.110240