Loading…
AndroDex: Android Dex Images of Obfuscated Malware
With the emergence of technology and the usage of a large number of smart devices, cyber threats are increasing. Therefore, research studies have shifted their attention to detecting Android malware in recent years. As a result, a reliable and large-scale malware dataset is essential to build effect...
Saved in:
Published in: | Scientific data 2024-02, Vol.11 (1), p.212-10, Article 212 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | With the emergence of technology and the usage of a large number of smart devices, cyber threats are increasing. Therefore, research studies have shifted their attention to detecting Android malware in recent years. As a result, a reliable and large-scale malware dataset is essential to build effective malware classifiers. In this paper, we have created AndroDex: an Android malware dataset containing a total of 24,746 samples that belong to more than 180 malware families. These samples are based on .dex images that truly reflect the characteristics of malware. To construct this dataset, we first downloaded the APKs of the malware, applied obfuscation techniques, and then converted them into images. We believe this dataset will significantly enhance a series of research studies, including Android malware detection and classification, and it will also boost deep learning classification efforts, among others. The main objective of creating images based on the Android dataset is to help other malware researchers better understand how malware works. Additionally, an important result of this study is that most malware nowadays employs obfuscation techniques to hide their malicious activities. However, malware images can overcome such issues. The main limitation of this dataset is that it contains images based on .dex files that are based on static analysis. However, dynamic analysis takes time, therefore, to overcome the issue of time and space this dataset can be used for the initial examination of any .apk files. |
---|---|
ISSN: | 2052-4463 2052-4463 |
DOI: | 10.1038/s41597-024-03027-3 |