Loading…

Apply transfer learning to cybersecurity: Predicting exploitability of vulnerabilities by description

Thousands of software vulnerabilities are archived and disclosed to the public each year, posing severe cybersecurity threats to the whole society. Predicting the exploitability of vulnerabilities is crucial for decision-makers to prioritize their efforts and patch the most critical vulnerabilities....

Full description

Saved in:

Bibliographic Details
Published in:	Knowledge-based systems 2020-12, Vol.210, p.106529, Article 106529
Main Authors:	Yin, Jiao, Tang, MingJian, Cao, Jinli, Wang, Hua
Format:	Article
Language:	English
Subjects:	Coders Cybersecurity Decision making Descriptions ExBERT Exploitability prediction Feature extraction Natural language processing Semantics Software Software reliability Software vulnerability Transfer learning
Citations:	Items that this one cites Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Thousands of software vulnerabilities are archived and disclosed to the public each year, posing severe cybersecurity threats to the whole society. Predicting the exploitability of vulnerabilities is crucial for decision-makers to prioritize their efforts and patch the most critical vulnerabilities. Software vulnerability descriptions are accessible features in early stage and contain rich semantic information. Therefore, descriptions are wildly used for exploitability prediction in both industry and academia. However, comparing with other corpora, the size of vulnerability description corpus is too small to train a comprehensive Natural Language Processing (NLP) model. To gain a better performance, this paper proposes a framework named ExBERT to accurately predict if a vulnerability will be exploited or not. ExBERT essentially is an improved Bidirectional Encoder Representations from Transformers (BERT) model for exploitability prediction. First, we fine-tune a pre-trained BERT using collected domain-specific corpus. Then, we design a Pooling Layer and a Classification Layer on top of the fine-tuned BERT model to extract sentence-level semantic features and predict the exploitability of vulnerabilities. Results on 46,176 real-word vulnerabilities have demonstrated that the proposed ExBERT framework achieves 91.12% on accuracy and 91.82% on precision, outperforming the state-of-the-art approach with 89.0% on accuracy and 81.8% on precision.
ISSN:	0950-7051 1872-7409
DOI:	10.1016/j.knosys.2020.106529