
A Method with Pre-trained Word Vectors for Detecting Wordlist-based Malicious Domain Names


Bibliographic Details
Published in: Journal of Physics: Conference Series, 2021-01, Vol. 1757 (1), p. 012171
Main Authors: Lin, Shaoqing; Zhong, Shangping; Cheng, Kaizhi
Format: Article
Language: English
Description
Summary: In recent years, botnets have used domain generation algorithms (DGAs) to produce dynamic, wordlist-based malicious domain names that bypass many detection methods. Existing deep-learning detection models generally pad each domain name and transform it into a fixed-length one-dimensional vector before classifying it, which yields poor detection performance on such names. This study instead splits the domain name into an array of words and converts it into word vectors using a pre-trained word vector model, Embeddings from Language Models (ELMo). The resulting representation is fed into a TextCNN model for training and classification. On a data set of approximately 100,000 samples, the method achieves an accuracy of 94.22% and a false positive rate (FPR) of 6.87%. Compared with previous detection models (i.e., LSTM and CNN), it requires more training and testing, but it improves on all indicators.
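The first step in the summary, dividing a domain into a word array, might look roughly like the sketch below. It is not the authors' implementation: the toy `WORDLIST`, the function names `split_domain` and `to_word_array`, the fewest-words tie-break, the `<pad>` token, and the `max_words` limit are all assumptions made for illustration. The ELMo embedding and TextCNN stages are omitted.

```python
from typing import List, Optional, Set

# Hypothetical toy wordlist; a realistic dictionary would be far larger.
WORDLIST: Set[str] = {"blue", "river", "stone", "green", "table", "cloud"}


def split_domain(label: str, wordlist: Set[str] = WORDLIST) -> Optional[List[str]]:
    """Split a label into wordlist words; return None if no full split exists."""
    n = len(label)
    # best[i] holds a fewest-words segmentation of label[:i], or None.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            piece = label[start:end]
            if prefix is not None and piece in wordlist:
                candidate = prefix + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


def to_word_array(domain: str, max_words: int = 4, pad: str = "<pad>") -> List[str]:
    """Segment the leftmost label, then truncate or pad to max_words entries."""
    label = domain.lower().split(".")[0]
    # Fall back to the whole label when it cannot be segmented.
    words = split_domain(label) or [label]
    words = words[:max_words]
    return words + [pad] * (max_words - len(words))
```

Calling `to_word_array("blueriverstone.com")` yields a fixed-length list of words that a pre-trained embedding model could then map to vectors. A label that cannot be segmented is kept as a single token in this sketch.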
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/1757/1/012171