Establishing a software defect prediction model via effective dimension reduction

Bibliographic Details
Published in: Information Sciences, 2019-03, Vol. 477, p. 399-409
Main Authors: Wei, Hua; Hu, Changzhen; Chen, Shiyou; Xue, Yuan; Zhang, Quanxin
Format: Article
Language: English
Description
Summary:
• The local tangent space alignment (LTSA) algorithm is applied to software defect distribution prediction. LTSA reduces the dimensionality of the input data space, which shortens the training and testing time of the SVM. LTSA also reduces noise interference in the data sets and improves mapping performance, which directly improves the software defect prediction metrics.
• The data redundancy problem is addressed. First, a high-dimensional feature space is constructed from the software defect data set. Then LTSA, a nonlinear manifold learning algorithm, is used to extract the hidden low-dimensional submanifolds.
• Parameters are selected adaptively. A grid search selects the embedding dimension and the neighborhood size, so that the local neighborhood parameters in the high-dimensional space are chosen adaptively, with the aim of ensuring parameter optimality.

With the continued growth of interoperable software developed for the Internet of Things (IoT), there is a growing demand to predict software defects at various testing and operational phases. This paper addresses the software defect prediction problem by proposing a novel model based on a local tangent space alignment support vector machine (LTSA-SVM) algorithm. The model employs the SVM algorithm as the basic classifier of the software defect distribution prediction model. The model parameters are then optimized by combining a grid search method with ten-fold cross-validation. In traditional dimensionality reduction algorithms, data loss arising from the nonlinearity of the data reduces the accuracy of the SVM. To address this problem, the paper uses the LTSA algorithm to extract the intrinsic low-dimensional structure of the data and perform effective dimension reduction. The SVM is then trained on the reduced-dimension data. Finally, the feasibility of the prediction model is verified. Compared with a single SVM and the LLE-SVM prediction algorithm, the proposed model improves prediction accuracy and F-measure by 1–4%.
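
The workflow described in the summary (LTSA dimension reduction, an SVM classifier, and grid search with ten-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes scikit-learn, whose `LocallyLinearEmbedding` class offers an LTSA variant through `method="ltsa"`. The synthetic data set, the feature scaling step, the RBF kernel, and all parameter grids are placeholder assumptions that do not come from the article.

```python
from sklearn.datasets import make_classification
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a software defect data set (assumption):
# 200 modules, 20 code metrics, roughly 30% labelled defective.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=6,
    weights=[0.7, 0.3], random_state=0,
)

# Scale the metrics, reduce dimension with LTSA, then classify with an SVM.
# The dense eigen-solver is chosen here because the data set is small.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("ltsa", LocallyLinearEmbedding(method="ltsa", eigen_solver="dense")),
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical grids over the embedding dimension, the neighborhood size,
# and the SVM penalty; the article does not list the values it searched.
param_grid = {
    "ltsa__n_components": [2, 4],
    "ltsa__n_neighbors": [10, 15],
    "svm__C": [1, 10],
}

# Ten-fold cross-validation, scored by F-measure of the defective class.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=cv)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean cross-validated F-measure:", round(search.best_score_, 3))
```

Placing the LTSA step inside the pipeline means the embedding is refit on each training fold, so the dimension and neighborhood size are tuned together with the SVM parameters and no test-fold data leaks into the embedding.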
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2018.10.056