Learning to Preselection: A Filter-Based Performance Predictor for Multiobjective Feature Selection in Classification

Bibliographic Details
Published in: IEEE Transactions on Evolutionary Computation, 2024, p. 1-1
Main Authors: Jiao, Ruwang; Xue, Bing; Zhang, Mengjie
Format: Article
Language: English
Description
Summary: Minimizing the classification error rate and minimizing the number of selected features are the two major objectives of feature selection. Because these objectives often conflict, feature selection is a multiobjective problem. Evolutionary algorithms have been widely used for multiobjective feature selection problems. Preselection in evolutionary algorithms is used to improve the sampling quality by selecting only potentially promising candidate solutions for fitness evaluations. However, traditional preselection methods struggle to handle feature selection effectively due to its large-scale combinatorial nature and intricate feature interactions. To alleviate this issue, this paper proposes a filter-based performance predictor to preselect feature subsets for subsequent classification fitness evaluations. It uses multiple filter measures to estimate the classification performance of a feature subset, which can explore complex feature interactions and is also insensitive to the dimensionality. Additionally, a correlation coefficient is used to measure the compatibility between the learned performance predictor and the classification performance. Based on the degree of compatibility, a preselection method that considers both the predicted classification performance and the feature subset diversity is proposed, which can preselect promising solutions from multiple candidate solutions and thus improve the feature subset search efficiency. The proposed method is verified experimentally on a total of 18 classification datasets spanning various domains, and the results reveal that it can find feature subsets with better classification performance and converge faster to competitive results compared to state-of-the-art methods.
ISSN: 1089-778X, 1941-0026
DOI: 10.1109/TEVC.2024.3373802