
PTLVD: Program Slicing and Transformer-based Line-level Vulnerability Detection System



Bibliographic Details
Main Authors:	Peng, Tao; Chen, Shixu; Zhu, Fei; Tang, Junwei; Liu, Junping; Hu, Xinrong
Format:	Conference Proceeding
Language:	English
Subjects:	Code gadgets; Codes; Image edge detection; Interpretability; Line-level; Location awareness; Neural networks; Predictive models; Program slicing; Representation learning; Source coding; Transformer; Vulnerability detection

Description
Summary:	In recent years, deep learning-based software vulnerability detection methods have made significant progress. However, most existing methods detect vulnerabilities at the function level or slice level and cannot pinpoint the exact lines of code that cause them. Program slicing can extract control and data dependency information from the code to assist deep learning models in detecting vulnerabilities. We propose a novel vulnerability detection model, PTLVD, which generates code gadgets (CGs) by slicing the program based on variables in the code, uses a transformer model for binary classification, and employs our proposed method, Integrated Gradients Enhanced with Saliency (IGS), to locate the lines of code that are likely to cause vulnerabilities. IGS enhances the interpretability of the model by combining the Integrated Gradients and Saliency methods. PTLVD employs an improved CG generation method that selectively removes irrelevant code statements, producing CGs that contain richer information and improve the model's performance. Additionally, during the preprocessing stage, PTLVD removes comments and normalizes code statements so that each occupies a single line, which improves both the detection performance and the vulnerability localization capability of the model. Experimental results show that, compared to state-of-the-art function-level and slice-level vulnerability detection models, PTLVD improves precision and F1 by 5.25% and 1.79%, respectively. In line-level prediction, compared to the baseline method, PTLVD improves Top-5 Accuracy by 1.61% and reduces Mean First Ranking by 5.08%.
ISSN:	2470-6892
DOI:	10.1109/SCAM59687.2023.00026
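
Note on the line-level metrics: the summary reports Top-5 Accuracy and Mean First Ranking without defining them. The sketch below shows one common reading of these metrics in line-level vulnerability localization, and it may differ from the definitions used in the paper. It assumes Top-k Accuracy is the fraction of samples whose top-k ranked lines include at least one truly vulnerable line, and Mean First Ranking is the average 1-based rank of the first truly vulnerable line (lower is better). The function names and sample data are hypothetical and are not taken from PTLVD.

```python
from typing import List, Optional, Set


def first_hit_rank(ranked_lines: List[int], vulnerable: Set[int]) -> Optional[int]:
    """Return the 1-based rank of the first truly vulnerable line, or None if absent."""
    for rank, line in enumerate(ranked_lines, start=1):
        if line in vulnerable:
            return rank
    return None


def top_k_accuracy(rankings: List[List[int]], truths: List[Set[int]], k: int = 5) -> float:
    """Fraction of samples with at least one vulnerable line among the top-k ranked lines."""
    if not rankings:
        raise ValueError("no samples to evaluate")
    hits = 0
    for ranked, vulnerable in zip(rankings, truths):
        rank = first_hit_rank(ranked, vulnerable)
        if rank is not None and rank <= k:
            hits += 1
    return hits / len(rankings)


def mean_first_ranking(rankings: List[List[int]], truths: List[Set[int]]) -> float:
    """Average rank of the first vulnerable line, over samples where one is ranked."""
    ranks = []
    for ranked, vulnerable in zip(rankings, truths):
        rank = first_hit_rank(ranked, vulnerable)
        if rank is not None:
            ranks.append(rank)
    if not ranks:
        raise ValueError("no sample contains a ranked vulnerable line")
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    # Hypothetical data: each inner list holds line numbers ordered from the
    # highest to the lowest attribution score produced by some explainer.
    rankings = [[7, 3, 9, 1], [2, 5, 8, 4, 6, 1, 0], [4, 2]]
    truths = [{3}, {1}, {2, 4}]
    print(top_k_accuracy(rankings, truths, k=5))
    print(mean_first_ranking(rankings, truths))
```

Under this reading, a higher Top-5 Accuracy and a lower Mean First Ranking both mean that a reviewer reaches a vulnerable line sooner when inspecting lines in ranked order.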