Defending unknown attacks on cyber-physical systems by semi-supervised approach and available unlabeled data
Published in: | Information sciences 2017-02, Vol.379, p.211-228 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Cyber-physical systems (CPS) are increasingly used in modern industrial systems. These systems face a significant threat from malicious software that exploits the tight integration of industrial software with hardware and network systems. Malicious code dynamically and continuously changes its internal structure and attack patterns using obfuscation techniques, such as polymorphism and metamorphism, in order to evade conventional malware detection engines. Countering this requires continuously updating the database of the malware detection engine, which demands periodic manual effort from experts and can limit the real-time protection of CPS. It also makes preserving the availability and integrity of the services provided by CPS against malicious code challenging, which creates a demand for specialized malware detection techniques for CPS.
In this paper, we propose a semi-supervised approach that automatically integrates knowledge about unknown malware from cheap, already available unlabeled data into the detection system. The novelty of the proposed approach is that it does not require expert effort to update the database of the detection engine. Instead, the dynamic changes in malware attack patterns are extracted from the unlabeled data by unsupervised clustering. The extracted geometric information about the intrinsic attack characteristics of the clusters is then integrated into the classification system of the detection engine, which updates the detection system automatically. The proposed approach uses global K-means clustering with term frequency (TF) and inverse document frequency (IDF) weighting, and cosine similarity as the distance measure, to extract the cluster information and add it to a support vector machine (SVM) classification system. The proposed approach has been tested extensively on a real malware data set for both static and dynamic malware features. The experimental results show that the proposed semi-supervised approach achieves higher accuracy than the existing supervised approaches for all classifiers. We note that the static feature-based semi-supervised approach can improve detection accuracy significantly. While applying the proposed semi-supervised approach with the run-time characteristics of dynamic feature analysis, the combined effect of dynamic analysis and the proposed approach further in |
---|---|
ISSN: | 0020-0255 1872-6291 |
DOI: | 10.1016/j.ins.2016.09.041 |
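
The pipeline described in the summary (TF and IDF weighting, clustering under cosine similarity, then passing cluster information to a classifier) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: plain k-means with a deterministic farthest-point initialization stands in for global K-means, the smoothed IDF formula is one common variant, the token lists are hypothetical API-call traces, and the SVM step is omitted. The cluster information is exposed here as one extra feature per cluster (the cosine similarity to that cluster's centroid), which a classifier such as an SVM could consume alongside its original features.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one sparse TF-IDF dict per tokenized document (smoothed IDF, an assumption)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(members):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in members:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(members) for t, w in total.items()}


def kmeans_cosine(vectors, k, iters=20):
    """Plain k-means under cosine similarity (a stand-in for global K-means)."""
    # Farthest-point initialization: start from the first vector, then keep
    # adding the vector least similar to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean_vector(members)
    return labels, centroids


def cluster_features(vectors, centroids):
    """One extra feature per cluster: similarity of each sample to its centroid."""
    return [[cosine(v, c) for c in centroids] for v in vectors]


# Hypothetical unlabeled samples, each a list of observed API-call tokens.
unlabeled = [
    ["open", "read", "close"],
    ["open", "read", "read", "close"],
    ["connect", "send", "recv"],
    ["connect", "send", "send", "recv"],
]
vectors = tfidf(unlabeled)
labels, centroids = kmeans_cosine(vectors, k=2)
features = cluster_features(vectors, centroids)
```

In this sketch the two file-access traces and the two network traces fall into separate clusters, and `features` holds the per-cluster similarities that would be appended to each sample before training a supervised classifier.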