Argha Chandra Dhar; Arna Roy; M. A. H. Akhand; Md. Abdus Samad Kamal; Kou Yamada

Cascaded Machine Learning Approach with Data Augmentation for Intrusion Detection System

PDF (884KB), PP.17-30


Author(s)

Argha Chandra Dhar ¹, Arna Roy ¹, M. A. H. Akhand ¹,*, Md. Abdus Samad Kamal ², Kou Yamada ²

1. Department of Computer Science and Engineering, Khulna University of Engineering & Technology, Khulna-9203, Bangladesh

2. Graduate School of Science and Technology, Gunma University, Kiryu 376-8515, Japan

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2024.04.02

Received: 6 Feb. 2023 / Revised: 20 Apr. 2023 / Accepted: 25 May 2023 / Published: 8 Aug. 2024

Index Terms

Cascaded Framework, Classification, Data Augmentation, Intrusion Detection System, Neural Network

Abstract

Cybersecurity has received significant attention globally as internet usage continues to expand, driven by the growing frequency and adverse impact of cybercrimes, which include disrupting businesses, corrupting or altering sensitive data, stealing or exposing information, and illegally accessing computer networks. Firewalls, antivirus systems, and Intrusion Detection Systems (IDSs) are commonly deployed to protect networks from such attacks. Recently, autonomous systems based on Machine Learning (ML), including Deep Learning (DL), have become the state of the art in cybersecurity owing to their rapid growth and superior performance. This study aims to develop a novel IDS that gives more attention to classifying attack cases correctly and categorizes attacks at the subclass level through a two-step process in a cascaded framework. The proposed framework detects attacks using one ML model and then classifies them into subclasses using a second ML model, with the two models applied in succession. The most challenging part is training both models on datasets in which attack and non-attack cases are unbalanced; this is addressed with a proposed data augmentation technique. Specifically, the limited attack samples of the dataset are augmented in the training set so that the models learn the attack cases properly. Finally, the proposed framework is implemented with a neural network (NN), a widely used ML model, and evaluated on the NSL-KDD dataset through a rigorous analysis of each subclass, with emphasis on the major attack class. The proposed cascaded approach with data augmentation is compared with three other models: the cascaded model without data augmentation, and the standard single NN model with and without data augmentation. Experimental results on the NSL-KDD dataset show that the proposed method is an effective IDS and outperforms existing state-of-the-art ML models.
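
The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the augmentation technique, so plain random oversampling is assumed here, and a tiny nearest-centroid classifier stands in for the neural network. The names `oversample`, `NearestCentroid`, `Cascade`, and the toy two-feature data are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def oversample(X, y, seed=0):
    """Duplicate random samples of smaller classes until all classes are equal in size.
    Assumption: stands in for the paper's (unspecified here) augmentation technique."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out

class NearestCentroid:
    """Stand-in classifier; the paper uses a neural network instead."""
    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: dist(x, self.centroids[c])) for x in X]

class Cascade:
    """Stage 1 separates attack from normal; stage 2 labels the attack subclass."""
    def __init__(self, make_model, normal_label="normal"):
        self.make_model = make_model
        self.normal_label = normal_label

    def fit(self, X, y):
        # Stage 1: collapse every attack subclass into one "attack" label.
        y_bin = [yi if yi == self.normal_label else "attack" for yi in y]
        self.detector = self.make_model().fit(*oversample(X, y_bin))
        # Stage 2: train on attack samples only, balanced across subclasses.
        X_att = [xi for xi, yi in zip(X, y) if yi != self.normal_label]
        y_att = [yi for yi in y if yi != self.normal_label]
        self.subclassifier = self.make_model().fit(*oversample(X_att, y_att))
        return self

    def predict(self, X):
        out = []
        for x, stage1 in zip(X, self.detector.predict(X)):
            if stage1 == self.normal_label:
                out.append(stage1)
            else:
                out.append(self.subclassifier.predict([x])[0])
        return out

# Toy, made-up data: 6 normal, 3 "dos", 1 "probe" (deliberately unbalanced).
X = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1), (0.1, 0.2), (0.0, -0.2), (-0.2, 0.0),
     (6.0, 1.0), (6.2, 0.9), (5.9, 1.1), (6.0, -1.0)]
y = ["normal"] * 6 + ["dos"] * 3 + ["probe"]

model = Cascade(NearestCentroid).fit(X, y)
predictions = model.predict([(0.1, 0.0), (6.1, 1.0), (6.0, -0.9)])
print(predictions)  # ['normal', 'dos', 'probe']
```

Keeping the two stages as separate models means each one can be balanced on its own label set: attack versus normal for the first, attack subclasses only for the second.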

Cite This Paper

Argha Chandra Dhar, Arna Roy, M. A. H. Akhand, Md. Abdus Samad Kamal, Kou Yamada, "Cascaded Machine Learning Approach with Data Augmentation for Intrusion Detection System", International Journal of Computer Network and Information Security (IJCNIS), Vol. 16, No. 4, pp. 17-30, 2024. DOI: 10.5815/ijcnis.2024.04.02


International Journal of Computer Network and Information Security (IJCNIS)