SLDeep: Statement-level software defect prediction using deep-learning model on static code features
Published in: Expert Systems with Applications, Vol. 147 (June 2020), Article 113156
Main Authors:
Format: Article
Language: English
Summary:
• We propose a suite of 32 statement-level metrics.
• We use long short-term memory (LSTM) as the learning model.
• We have experimented on more than 100,000 C/C++ programs.
• We have achieved a recall of about 96% in the experiments.
Software defect prediction (SDP) seeks to estimate fault-prone areas of the code so that testing activities can be focused on the most suspicious portions. Consequently, high-quality software can be released with less time and effort. Current SDP techniques, however, work at coarse-grained units such as a module or a class, leaving developers with the burden of locating the fault. To address this issue, we propose a new technique called Statement-Level software defect prediction using Deep-learning model (SLDeep). The significance of SLDeep for intelligent and expert systems is that it demonstrates a novel application of deep-learning models to a practical problem faced by software developers. To reify our proposal, we defined a suite of 32 statement-level metrics, such as the number of binary and unary operators used in a statement. We then applied long short-term memory (LSTM) as the learning model. We conducted experiments on 119,989 C/C++ programs from Code4Bench. These programs comprise 2,356,458 lines of code, of which 292,064 lines are faulty. The benchmark comprises a diverse set of programs and versions written by thousands of developers, so it tends to yield a model suitable for cross-project SDP. In the experiments, our trained model successfully classified unseen data (that is, the fault-proneness of new statements) with average performance of 0.979, 0.570, and 0.702 in terms of recall, precision, and accuracy, respectively. These experimental results suggest that SLDeep is effective for statement-level SDP. The impact of this work is twofold. First, working at the statement level further alleviates the developer's burden in pinpointing fault locations. Second, the cross-project capability of SLDeep helps defect prediction research become more industrially viable.
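As a rough illustration of the pipeline the summary describes (per-statement feature vectors fed to an LSTM and evaluated with recall, precision, and accuracy), the sketch below trains a small Keras classifier on synthetic data. It is not the authors' implementation: the context window of consecutive statements, the layer sizes, and the names NUM_METRICS and WINDOW are assumptions made for illustration, and the random arrays stand in for metrics that would in practice be extracted statically from the Code4Bench programs.

```python
# Minimal sketch, assuming a 32-dimensional feature vector per statement and a
# binary faulty / non-faulty label; not the paper's actual model or data pipeline.
import numpy as np
from tensorflow.keras import layers, models, metrics

NUM_METRICS = 32   # e.g., counts of binary and unary operators used in a statement
WINDOW = 10        # assumed context: a short window of consecutive statements

# Toy data standing in for statement-level metrics extracted from real C/C++ code.
rng = np.random.default_rng(0)
X = rng.random((1000, WINDOW, NUM_METRICS), dtype=np.float32)
y = rng.integers(0, 2, size=(1000, 1))                # 1 = fault-prone statement

model = models.Sequential([
    layers.Input(shape=(WINDOW, NUM_METRICS)),
    layers.LSTM(64),                                   # LSTM as the learning model
    layers.Dense(1, activation="sigmoid"),             # binary fault-proneness score
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy", metrics.Recall(), metrics.Precision()])
model.fit(X, y, epochs=3, batch_size=32, validation_split=0.2)
```

In a real setting, the feature vectors would come from a static analyzer computing the 32 statement-level metrics, and the recall/precision/accuracy reported by Keras would be measured on held-out, unseen statements as described in the abstract.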
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2019.113156