Data sampling approach using heuristic Learning Vector Quantization (LVQ) classifier for software defect prediction

Bibliographic Details
Published in: Journal of Intelligent & Fuzzy Systems, 2023-01, Vol. 44 (3), p. 3867-3876
Main Authors: Amanullah, M., Thanga Ramya, S., Sudha, M., Gladis Pushparathi, V.P., Haldorai, Anandakumar, Pant, Bhaskar
Format: Article
Language: English
Description
Summary: Early prediction and identification of software defects is crucial for estimating software quality. Software defect prediction (SDP) is the process of identifying defect-prone software using prediction models and defect datasets. This study proposes an SDP method that handles the class imbalance problem with an Improved Random Synthetic Minority Oversampling Technique (Improved Random-SMOTE) and selects features with the Linear Pearson Correlation technique. The defect datasets are first processed with Improved Random-SMOTE oversampling to address the class imbalance. Features are then chosen using Linear Pearson Correlation, and the samples are split into training and testing sets by k-fold cross-validation. Finally, a Heuristic Learning Vector Quantization (LVQ) classifier is used to predict software defects. The performance of the proposed method is compared with existing classification approaches on two separate datasets, using sensitivity, specificity, false positive rate (FPR), and accuracy.
ISSN: 1064-1246, 1875-8967
DOI: 10.3233/JIFS-220480
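
The four stages named in the summary (oversampling, Pearson-based feature selection, k-fold splitting, LVQ classification) can be illustrated with a rough sketch. This is not the authors' implementation: the record gives no algorithmic details, so plain SMOTE stands in for Improved Random-SMOTE, textbook LVQ1 stands in for the heuristic LVQ variant, and the toy data, function names, and parameter values (`n_keep`, `per_class`, `lr`, `epochs`) are all assumptions made for illustration.

```python
import numpy as np

def smote_oversample(X, y, minority=1, k=5, seed=0):
    """Plain SMOTE (assumed stand-in): interpolate between minority samples."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = int((y != minority).sum()) - len(X_min)
    if n_new <= 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)               # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    mate = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                 # position along the connecting segment
    synthetic = X_min[base] + gap * (X_min[mate] - X_min[base])
    return np.vstack([X, synthetic]), np.concatenate([y, np.full(n_new, minority)])

def pearson_select(X, y, n_keep):
    """Indices of the n_keep features with the largest |Pearson r| to the label."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.argsort(-np.abs(r))[:n_keep]

def lvq1_fit(X, y, per_class=2, lr=0.1, epochs=30, seed=0):
    """Textbook LVQ1 (assumed stand-in): pull or push the nearest prototype."""
    rng = np.random.default_rng(seed)
    protos, labels = [], []
    for c in np.unique(y):                       # initialise prototypes from class samples
        idx = rng.choice(np.flatnonzero(y == c), size=per_class, replace=False)
        protos.append(X[idx])
        labels += [c] * per_class
    W, wl = np.vstack(protos).astype(float), np.array(labels)
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)         # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            j = np.argmin(np.linalg.norm(W - X[i], axis=1))
            sign = 1.0 if wl[j] == y[i] else -1.0
            W[j] += sign * rate * (X[i] - W[j])
    return W, wl

def lvq1_predict(W, wl, X):
    dist = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return wl[np.argmin(dist, axis=1)]

def confusion_metrics(y_true, y_pred):
    tp = int(((y_true == 1) & (y_pred == 1)).sum())
    tn = int(((y_true == 0) & (y_pred == 0)).sum())
    fp = int(((y_true == 0) & (y_pred == 1)).sum())
    fn = int(((y_true == 1) & (y_pred == 0)).sum())
    return {"sensitivity": tp / max(tp + fn, 1), "specificity": tn / max(tn + fp, 1),
            "fpr": fp / max(fp + tn, 1), "accuracy": (tp + tn) / len(y_true)}

def run_pipeline(X, y, n_keep=3, folds=5, seed=0):
    """Stages in the order the summary lists them; returns fold-averaged metrics."""
    Xb, yb = smote_oversample(X, y, seed=seed)
    Xs = Xb[:, pearson_select(Xb, yb, n_keep)]
    parts = np.array_split(np.random.default_rng(seed).permutation(len(Xs)), folds)
    scores = []
    for i, test in enumerate(parts):
        train = np.concatenate([p for j, p in enumerate(parts) if j != i])
        W, wl = lvq1_fit(Xs[train], yb[train], seed=seed)
        scores.append(confusion_metrics(yb[test], lvq1_predict(W, wl, Xs[test])))
    return {m: float(np.mean([s[m] for s in scores])) for m in scores[0]}

def make_toy_data(seed=0):
    """Hypothetical imbalanced data: 180 clean modules, 20 defective ones."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 1.0, size=(200, 6))
    y = np.array([0] * 180 + [1] * 20)
    X[y == 1, :2] += 3.0                         # only the first two features carry signal
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    print(run_pipeline(X, y))
```

The sketch oversamples before splitting because that is the order the summary describes. A common alternative is to resample only the training portion of each fold, so that synthetic points derived from test samples do not influence training.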