Malicious sequential pattern mining for automatic malware detection

Bibliographic Details
Published in: Expert Systems with Applications, 2016-06, Vol. 52, p. 16-25
Main Authors: Fan, Yujie; Ye, Yanfang; Chen, Lifei
Format: Article
Language: English
Description
Summary:
• An effective framework using a sequence mining technique is proposed for automatic malware detection.
• An efficient sequential pattern mining algorithm is designed to discover patterns that discriminate between malware and benign samples.
• A new nearest neighbor classifier serves as the detection module to identify unknown malware.
• The proposed framework shows strong results compared with existing malware detection methods in detecting new malicious samples.

Because of the damage it does to Internet security, malware (e.g., viruses, worms, trojans) and its detection have attracted the attention of both the anti-malware industry and researchers for decades. To protect legitimate users from attacks, the most significant line of defense against malware is anti-malware software, which mainly uses signature-based methods for detection. However, signature-based detection fails to recognize new, previously unseen malicious executables. To solve this problem, in this paper, based on the instruction sequences extracted from the file sample set, we propose an effective sequence mining algorithm to discover malicious sequential patterns, and then construct an All-Nearest-Neighbor (ANN) classifier for malware detection based on the discovered patterns. The resulting data mining framework, composed of the proposed sequential pattern mining method and the ANN classifier, characterizes the malicious patterns in the collected file sample set well enough to effectively detect previously unseen malware samples. A comprehensive experimental study on a real data collection is performed to evaluate our detection framework. Promising experimental results show that our framework outperforms other data mining based detection methods in identifying new malicious executables.
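The two stages described in the summary (mining patterns that separate malware from benign instruction sequences, then classifying by nearest neighbors over those patterns) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the candidate generation, the support thresholds `min_mal` and `max_ben`, the toy instruction sequences, and the plain nearest-neighbor vote used in place of the All-Nearest-Neighbor classifier are all assumptions made for illustration.

```python
from collections import Counter

def contains(seq, pattern):
    """True if `pattern` occurs in `seq` in order (gaps allowed)."""
    it = iter(seq)
    return all(op in it for op in pattern)

def support(pattern, samples):
    """Fraction of samples that contain the pattern."""
    return sum(contains(s, pattern) for s in samples) / len(samples)

def candidate_patterns(samples, max_len):
    """Assumed candidate set: every contiguous run of up to max_len instructions."""
    cands = set()
    for seq in samples:
        for n in range(1, max_len + 1):
            for i in range(len(seq) - n + 1):
                cands.add(tuple(seq[i:i + n]))
    return cands

def mine_discriminative(malware, benign, max_len=2, min_mal=1.0, max_ben=0.0):
    """Keep patterns frequent in malware and rare in benign (hypothetical thresholds)."""
    return sorted(
        p for p in candidate_patterns(malware, max_len)
        if support(p, malware) >= min_mal and support(p, benign) <= max_ben
    )

def featurize(seq, patterns):
    """Binary vector: which mined patterns the sample contains."""
    return tuple(int(contains(seq, p)) for p in patterns)

def predict(seq, patterns, train):
    """Vote among all training samples at the minimum Hamming distance.
    A simple stand-in, not the paper's All-Nearest-Neighbor classifier."""
    x = featurize(seq, patterns)
    dists = [(sum(u != v for u, v in zip(f, x)), label) for f, label in train]
    best = min(d for d, _ in dists)
    votes = Counter(label for d, label in dists if d == best)
    return votes.most_common(1)[0][0]

# Toy instruction sequences (invented for illustration only).
malware = [["push", "mov", "xor", "call"],
           ["mov", "xor", "jmp", "call"],
           ["push", "xor", "call", "ret"]]
benign = [["push", "mov", "add", "ret"],
          ["mov", "add", "call", "ret"],
          ["push", "mov", "ret"]]

patterns = mine_discriminative(malware, benign)
train = ([(featurize(s, patterns), "malware") for s in malware]
         + [(featurize(s, patterns), "benign") for s in benign])

print(patterns)
print(predict(["mov", "xor", "add", "call"], patterns, train))
print(predict(["push", "add", "ret"], patterns, train))
```

On this toy data, only patterns that appear in every malware sample and in no benign sample survive the thresholds, so an unseen sequence is labeled by which training samples share its pattern profile. A real miner would prune candidates by support instead of enumerating them exhaustively.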
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2016.01.002