A many objective based feature selection model for software defect prediction
Published in: | Concurrency and computation 2024-08, Vol.36 (19), p.n/a |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Given the escalating magnitude and intricacy of software systems, software measurement data often contains irrelevant and redundant features, resulting in significant resource and storage requirements for software defect prediction (SDP). Feature selection (FS) plays a vital role in the initial data preparation phase of SDP. Nonetheless, existing FS methods suffer from issues such as insignificant dimensionality reduction, low classification accuracy on the selected feature subsets, and neglect of the complex interactions and dependencies among features and between features and classes. To tackle these problems, this paper proposes a many‐objective SDPFS (MOSDPFS) model and a binary many‐objective particle swarm optimization (PSO) algorithm with an adaptive enhanced selection strategy (BMaOPSO‐AR2). MOSDPFS takes the F1 score, the number of features in a subset, and mutual information (MI) based correlation and redundancy measures as its optimization objectives. BMaOPSO‐AR2 constructs a binary version of MaOPSO using transfer functions specifically for binary classification. Adaptive update formulas and the R2 indicator are employed to improve the diversity and convergence of the algorithm. The performance of MOSDPFS and BMaOPSO‐AR2 is tested on the NASA‐MDP and PROMISE datasets. Numerical results show that the proposed model and algorithm effectively reduce the feature count while enhancing predictive accuracy and minimizing model complexity. |
---|---|
ISSN: | 1532-0626 1532-0634 |
DOI: | 10.1002/cpe.8153 |
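
The summary names two ingredients that are easier to picture with a small example: a transfer function that turns real-valued PSO velocities into a binary feature mask, and MI-based objectives scored on the selected subset. The following is a minimal sketch under stated assumptions, not the paper's implementation. The record does not say which transfer function or which MI formulas BMaOPSO‐AR2 uses, so the sigmoid (S-shaped) transfer function of classic binary PSO and simple mean-MI relevance and redundancy scores are assumed here. All function names are hypothetical, features are assumed to be already discretized, and the F1 objective is left out because it requires training a classifier.

```python
import math
import random
from collections import Counter
from itertools import combinations


def sigmoid_transfer(velocity):
    """Assumed S-shaped transfer function: velocity -> probability in [0, 1]."""
    v = max(-60.0, min(60.0, velocity))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-v))


def binarize(velocities, rng):
    """Sample a 0/1 feature mask: bit i is 1 with probability sigmoid(v_i)."""
    return [1 if rng.random() < sigmoid_transfer(v) else 0 for v in velocities]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def objectives(mask, columns, labels):
    """Return (n_selected, relevance, redundancy) for one feature mask.

    relevance  = mean MI between each selected feature and the class labels
    redundancy = mean pairwise MI among the selected features
    (both are illustrative definitions, assumed for this sketch)
    """
    selected = [col for bit, col in zip(mask, columns) if bit]
    if not selected:
        return 0, 0.0, 0.0
    relevance = sum(mutual_information(c, labels) for c in selected) / len(selected)
    pairs = list(combinations(selected, 2))
    redundancy = (
        sum(mutual_information(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    )
    return len(selected), relevance, redundancy


if __name__ == "__main__":
    # Toy data: features 0 and 1 duplicate the label, feature 2 is unrelated.
    columns = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
    labels = [0, 0, 1, 1]
    mask = binarize([2.0, 2.0, -2.0], random.Random(0))
    print(mask, objectives(mask, columns, labels))
```

In a many-objective setting, a mask would be preferred when it has fewer features, higher relevance, and lower redundancy, alongside a higher F1 score. On the toy data, selecting the two duplicated features gives high relevance but equally high redundancy, which is the trade-off the MI-based objectives are meant to expose.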