Loading…

A many objective based feature selection model for software defect prediction

Summary Given the escalating magnitude and intricacy of software systems, software measurement data often contains irrelevant and redundant features, resulting in significant resource and storage requirements for software defect prediction (SDP). Feature selection (FS) has a vital impact on the init...

Full description

Saved in:
Bibliographic Details
Published in:Concurrency and computation 2024-08, Vol.36 (19), p.n/a
Main Authors: Mao, Qi, Zhang, Jingbo, Zhao, Tianhao, Cai, Xingjuan
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Summary Given the escalating magnitude and intricacy of software systems, software measurement data often contains irrelevant and redundant features, resulting in significant resource and storage requirements for software defect prediction (SDP). Feature selection (FS) has a vital impact on the initial data preparation phase of SDP. Nonetheless, existing FS methods suffer from issues such as insignificant dimensionality reduction, low accuracy in classifying chosen optimal feature sets, and neglect of complex interactions and dependencies between defect data and features as well as between features and classes. To tackle the aforementioned problems, this paper proposes a many‐objective SDPFS (MOSDPFS) model and the binary many‐objective PSO algorithm with adaptive enhanced selection strategy (BMaOPSO‐AR2) is proposed within this paper. MOSDPFS selects F1 score, the number of features within subsets, and correlation and redundancy measures based on mutual information (MI) as optimization objectives. BMaOPSO‐AR2 constructs a binary version of MaOPSO using transfer functions specifically for binary classification. Adaptive update formulas and the introduction of the R2 indicator are employed to augment the variety and convergence of algorithm. Additionally, performance of MOSDPFS and BMaOPSO‐AR2 are tested on the NASA‐MDP and PROMISE datasets. Numerical results prove that a proposed model and algorithm effectively reduces feature count while enhancing predictive accuracy and minimizing model complexity.
ISSN:1532-0626
1532-0634
DOI:10.1002/cpe.8153