
A Cross-Project Defect Prediction Model Based on Deep Learning With Self-Attention




Bibliographic Details
Published in:	IEEE Access, 2022, Vol. 10, p. 110385-110401
Main Authors:	Wen, Wanzhi; Zhang, Ruinian; Wang, Chuyue; Shen, Chenqiang; Yu, Meng; Zhang, Suchuan; Gao, Xinxin
Format:	Article
Language:	English
Subjects:	Algorithms; Alliances; Codes; Deep learning; Defect prediction; Defects; Feature extraction; Logic gates; long and short-term memory; Long short term memory; Machine learning; Performance enhancement; Performance evaluation; Prediction models; Predictive models; self-attention mechanism; Semantics; Software; Source code; Syntactics

Description
Summary:	Cross-project defect prediction is a hot topic in software defect research because of the large difference in data distribution between the source project and the target project. Software defect prediction techniques usually first extract software project features and then train prediction models based on various classifiers. However, traditional features lack sufficient semantic information about the source code, which results in poor performance of the prediction models. To construct more accurate prediction models based on semantic information, we propose a cross-project defect prediction framework named BSLDP, which extracts the semantic information of source code files through a bidirectional long short-term memory network with a self-attention mechanism. In particular, we provide a semantic extractor named ALC, which extracts source code semantics from source code files, and propose a classification algorithm named BSL, which builds a prediction model from the semantic information of the source project and the target project. Furthermore, we propose an equal meshing mechanism in which ALC generates semantic information on small fragments obtained by dividing the numerical token vector, to further improve the performance of the proposed model. We evaluated the performance of the proposed model on a publicly available PROMISE dataset. Compared with four state-of-the-art methods, the experimental results indicate that, on average, BSLDP improves the performance of cross-project defect prediction in terms of F1 by 14.2%, 34.6%, 32.2% and 23.6%, respectively.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3214536
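
Illustrative note: the summary mentions an equal meshing mechanism that divides the numerical token vector into small fragments. The record gives no further detail, so the following is only a minimal sketch of one way such a split could look, not the authors' implementation. The function name `equal_mesh`, the `fragment_len` parameter, and the zero-padding of the last fragment are all assumptions made for illustration.

```python
from typing import List


def equal_mesh(token_ids: List[int], fragment_len: int, pad_id: int = 0) -> List[List[int]]:
    """Split a numerical token vector into equal-length fragments.

    Hypothetical helper: the padding strategy (pad_id = 0) is an assumption,
    not something stated in the catalog summary.
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    fragments = []
    for start in range(0, len(token_ids), fragment_len):
        fragment = token_ids[start:start + fragment_len]
        # Pad the final fragment so every fragment has the same length.
        fragment = fragment + [pad_id] * (fragment_len - len(fragment))
        fragments.append(fragment)
    return fragments


if __name__ == "__main__":
    vector = [12, 7, 7, 31, 5, 18, 2]  # made-up token IDs for one source file
    print(equal_mesh(vector, fragment_len=3))
    # prints [[12, 7, 7], [31, 5, 18], [2, 0, 0]]
```

Under these assumptions, each fixed-length fragment could then be passed separately to a sequence encoder, which is one plausible reading of "generates semantic information on small fragments".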