Predicting Intrusion in a Network Traffic Using Variance of Neighboring Object’s Distance

Full Text (PDF, 440KB), PP.73-84

Author(s)

Krishna Gopal Sharma 1,*, Yashpal Singh 2

1. Dr. A.P.J. Abdul Kalam Technical University / CSE, Lucknow, India

2. Bundelkhand Institute of Engineering & Technology / CSE, Jhansi, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2023.02.06

Received: 11 Feb. 2022 / Revised: 4 Jun. 2022 / Accepted: 20 Aug. 2022 / Published: 8 Apr. 2023

Index Terms

Intrusion Detection, Prediction, Machine Learning, Binary Classification, K-distance, KNN, KDD, Variance

Abstract

Activities in network traffic can be broadly classified into two categories: normal and malicious. Malicious activities are harmful, and detecting them is necessary for security. Intrusion detection monitors network traffic to identify malicious activities in the system. Any algorithm that divides objects into two categories, such as good or bad, is a binary class predictor, or binary classifier. In this paper, we use the Nearest Neighbor Distance Variance (NNDV) classifier to predict intrusion. NNDV is a binary class predictor based on the variance of the distances between objects. We evaluated NNDV on the KDD CUP 99 dataset and compared its predictive accuracy with that of the K Nearest Neighbor (KNN) classifier. KNN is an efficient general-purpose classifier, but we considered only its binary case. The results show that NNDV is comparable to KNN and in many cases performs better. We ran NNDV on both normalized and unnormalized data and found that accuracy is generally better on normalized data. We also compared the accuracy of NNDV on the KDD CUP 99 dataset under different cross-validation schemes: 2-fold, 5-fold, 10-fold, and leave-one-out. Cross-validation results can help in determining the parameters of the algorithm.
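
The ingredients named in the abstract (min-max style normalization, distances to the k nearest neighbors, the variance of those distances, a binary KNN vote, and k-fold or leave-one-out cross-validation) can be illustrated with a minimal Python sketch that uses only the standard library. This is not the authors' NNDV classifier: the abstract does not state how NNDV turns the variance into a prediction, so the sketch computes the variance only as a quantity and uses plain KNN for the prediction step. All function names, the choice of min-max scaling, the contiguous fold layout, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import pvariance


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def k_nearest(train_x, query, k):
    """Return (distance, index) pairs for the k training points closest to query."""
    dists = sorted((math.dist(x, query), i) for i, x in enumerate(train_x))
    return dists[:k]


def neighbor_distance_variance(train_x, query, k):
    """Population variance of the distances from query to its k nearest neighbors.

    Assumption: shown only as a quantity; the decision rule NNDV builds on
    top of it is not described in the abstract.
    """
    return pvariance([d for d, _ in k_nearest(train_x, query, k)])


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest neighbors (plain KNN)."""
    votes = Counter(train_y[i] for _, i in k_nearest(train_x, query, k))
    return votes.most_common(1)[0][0]


def fold_indices(n, folds):
    """Split range(n) into contiguous test folds; folds == n is leave-one-out."""
    bounds = [round(i * n / folds) for i in range(folds + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(folds)]


def cross_val_accuracy(xs, ys, k, folds):
    """Fraction of points predicted correctly when each fold is held out in turn."""
    correct = 0
    for test in fold_indices(len(xs), folds):
        held = set(test)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        correct += sum(knn_predict(tr_x, tr_y, xs[i], k) == ys[i] for i in test)
    return correct / len(xs)


# Invented toy data: two well-separated clusters with interleaved labels.
TOY_X = [[0, 0], [10, 10], [1, 0], [10, 11], [0, 1],
         [11, 10], [1, 1], [11, 11], [0.5, 0.5], [10.5, 10.5]]
TOY_Y = ["normal", "malicious"] * 5

if __name__ == "__main__":
    scaled = min_max_normalize(TOY_X)
    for folds in (2, 5, len(TOY_X)):  # the last value is leave-one-out
        print(folds, cross_val_accuracy(scaled, TOY_Y, k=3, folds=folds))
```

On real data such as KDD CUP 99, the records would normally be shuffled or stratified before splitting, since contiguous folds are only reasonable here because the toy labels alternate.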

Cite This Paper

Krishna Gopal Sharma, Yashpal Singh, "Predicting Intrusion in a Network Traffic Using Variance of Neighboring Object’s Distance", International Journal of Computer Network and Information Security (IJCNIS), Vol.15, No.2, pp.73-84, 2023. DOI: 10.5815/ijcnis.2023.02.06
