Malware Multi-Class Classification based on Malware Visualization using a Convolutional Neural Network Model

Full Text (PDF, 610KB), PP.20-29


Author(s)

Balram Yadav 1,*, Sanjiv Tokekar 2

1. Computer Engineering, Institute of Engineering and Technology, DAVV, Indore-452017, India

2. Director and Head of Department, Electronics and Telecommunication Engineering, Institute of Engineering and Technology, DAVV, Indore-452017, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2023.02.03

Received: 26 Aug. 2022 / Revised: 25 Oct. 2022 / Accepted: 10 Dec. 2022 / Published: 8 Apr. 2023

Index Terms

Convolutional neural network, CNN, Deep learning, DL, DL models, Malware, Malware classification, Malware visualization

Abstract

Malware classification has been a prominent concern for decades, and malware attacks have proliferated at an alarming rate, posing a significant threat to cyberspace. Deep learning (DL) and malware-image approaches are becoming more prevalent in malware analysis and have produced strong results. This work addresses the challenge of classifying malware variants that are represented as images. It employs visualization and proposes a convolutional neural network (CNN) based DL model to classify malware effectively and accurately. The proposed model is trained and tested on a challenging, heterogeneous dataset and achieves an accuracy of 98.179%, a precision of 97.39%, and an F1-score of 97.70%. It is also fast: classifying 934 unseen malware samples takes 3 seconds, or roughly 3.2 ms per sample. The proposed model outperformed existing traditional DL models on several performance measures and proved useful for classifying malware families through visualization. The experimental results indicate that small-scale malware images and a simple CNN architecture alone are sufficient to classify malware families with high accuracy.
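The abstract does not spell out how a binary becomes an image, so the following is only a minimal sketch of the byte-to-pixel mapping commonly used in malware visualization: each byte (0 to 255) is treated as one grayscale pixel, and the byte stream is wrapped into rows of a fixed width. The function name `bytes_to_grayscale`, the width of 32, and the zero-padding of the last row are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 32) -> List[List[int]]:
    """Wrap a byte stream into rows of `width` grayscale pixels (0-255).

    Assumption: the final, incomplete row is padded with zeros (black).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows: List[List[int]] = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])   # each byte is one pixel value
        row.extend([0] * (width - len(row)))    # zero-pad the last row if short
        rows.append(row)
    return rows


# Hypothetical stand-in for the raw bytes of a malware binary (515 bytes).
sample = bytes(range(256)) * 2 + b"\x90\x90\x90"
image = bytes_to_grayscale(sample, width=32)
print(len(image), "rows x", len(image[0]), "columns")
```

A 2D array like `image` is the kind of input a small CNN classifier could consume after resizing to a fixed shape; the resizing step and the network itself are outside the scope of this sketch.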

Cite This Paper

Balram Yadav, Sanjiv Tokekar, "Malware Multi-Class Classification based on Malware Visualization using a Convolutional Neural Network Model", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol. 15, No. 2, pp. 20-29, 2023. DOI: 10.5815/ijieeb.2023.02.03
