
AndroDex: Android Dex images of obfuscated malware

Aurangzeb, Sana, Aleem, Muhammad, Khan, Muhammad Taimoor (ORCID: https://orcid.org/0000-0002-5752-6420), Loukas, George (ORCID: https://orcid.org/0000-0003-3559-5182) and Sakellari, Georgia (ORCID: https://orcid.org/0000-0001-7238-8700) (2024) AndroDex: Android Dex images of obfuscated malware. Scientific Data, 11 (212). pp. 1-10. ISSN 2052-4463 (Online) (doi:10.1038/s41597-024-03027-3)

PDF (Publisher VoR)
45644_TAIMOOR KHAN_AndroDex_Android_Dex_images_of_obfuscated_malware.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

As smart devices proliferate, cyber threats are increasing, and research attention has shifted toward detecting Android malware in recent years. A reliable, large-scale malware dataset is therefore essential for building effective malware classifiers. In this paper, we present AndroDex, an Android malware dataset containing a total of 24,746 samples that belong to more than 180 malware families. The samples are images generated from .dex files, which reflect the characteristics of the malware. To construct the dataset, we first downloaded the malware APKs, applied obfuscation techniques, and then converted them into images. We believe this dataset will support a range of research studies, including Android malware detection and classification, and will also benefit deep learning classification efforts, among others. The main objective of creating images from the Android dataset is to help other malware researchers better understand how malware works. An important observation of this study is that most malware nowadays employs obfuscation techniques to hide its malicious activities; malware images can help overcome this issue. The main limitation of the dataset is that its images are derived from .dex files and therefore rely on static analysis. However, because dynamic analysis is time-consuming, the dataset can be used for the initial examination of any .apk file, saving both time and space.
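
The abstract describes converting .dex files into images without giving the procedure. As an illustration only, the sketch below shows one common way such a conversion can be done: read classes.dex out of an APK (which is a ZIP archive), lay its bytes out row by row as 0-255 grayscale pixels, and serialize the result as a binary PGM image. The function names, the fixed row width of 256, the zero padding, and the PGM output format are all assumptions made for this example; they are not taken from the paper, whose actual image layout and colour encoding may differ.

```python
import io
import zipfile


def read_dex_from_apk(apk_bytes: bytes, member: str = "classes.dex") -> bytes:
    """Return the raw bytes of one .dex member inside an APK (a ZIP archive)."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return apk.read(member)


def bytes_to_gray_rows(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay bytes out row by row as 0-255 pixel values.

    The row width and the zero padding of the last row are illustrative
    choices (assumptions), not parameters reported for AndroDex.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes(-len(data) % width)  # pad to a multiple of width
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def to_pgm(rows: list[list[int]]) -> bytes:
    """Serialize the pixel rows as a binary PGM (P5) grayscale image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + b"".join(bytes(row) for row in rows)
```

Under these assumptions, an image for one sample would be produced with `to_pgm(bytes_to_gray_rows(read_dex_from_apk(apk_bytes)))`, where `apk_bytes` holds the contents of an .apk file. Because the mapping is a direct byte-to-pixel layout, it requires no execution of the sample, which matches the static-analysis character of the dataset noted in the abstract.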

Item Type: Article
Uncontrolled Keywords: malware; Android; cyber attacks
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics > QA76 Computer software
T Technology > T Technology (General)
Faculty / School / Research Centre / Research Group: Faculty of Engineering & Science
Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS)
Last Modified: 20 Feb 2024 14:18
URI: http://gala.gre.ac.uk/id/eprint/45644
