Comparative Analysis of Classification Algorithms on KDD’99 Data Set

PP. 34-40

Author(s)

Iknoor Singh Arora 1,*, Gurpriya Kaur Bhatia 1, Amrit Pal Singh 2

1. Systems Engineer, Infosys Technologies Ltd, India and USICT, GGSIPU, New-Delhi, India

2. GTBIT, GGSIPU, New-Delhi, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2016.09.05

Received: 15 Oct. 2015 / Revised: 3 Feb. 2016 / Accepted: 11 May 2016 / Published: 8 Sep. 2016

Index Terms

Intrusion detection system, Naïve Bayes, J48, KDD’99 (Knowledge Discovery and Data Mining)

Abstract

Due to the enormous growth of network-based services and the need for secure communication over the network, there is an increasing emphasis on improving intrusion detection systems so that they can detect the growing number of network attacks. Many data mining techniques have been proposed to detect intrusions in the network. In this paper, two classification algorithms are studied: Naïve Bayes and J48. Results obtained by applying these algorithms to 10% of the KDD’99 dataset and to 10% of the filtered KDD’99 dataset are compared and analyzed using several performance metrics. The two algorithms are also compared on the percentage of correctly classified instances of the different attack categories present in both datasets, and on the time they take to build their classification models. Overall, J48 is a better classifier than Naïve Bayes on both datasets, but it is slower to build its classification model.
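
The per-category comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling or data: the function names `per_category_accuracy` and `overall_accuracy` are hypothetical, and the label lists are invented toy values. It assumes the usual grouping of KDD’99 records into normal, DoS, Probe, R2L, and U2R.

```python
from collections import defaultdict


def per_category_accuracy(actual, predicted):
    """Percentage of correctly classified instances for each true category."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    totals = defaultdict(int)   # instances seen per true category
    correct = defaultdict(int)  # instances classified correctly per true category
    for true_label, guess in zip(actual, predicted):
        totals[true_label] += 1
        if true_label == guess:
            correct[true_label] += 1
    return {cat: 100.0 * correct[cat] / totals[cat] for cat in totals}


def overall_accuracy(actual, predicted):
    """Percentage of correctly classified instances over the whole set."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one instance is required")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * hits / len(actual)


# Invented toy labels (NOT results from the paper), one entry per connection record.
actual = ["normal", "dos", "dos", "probe", "r2l", "u2r", "dos", "normal"]
# Hypothetical output of some classifier that misses the rare R2L and U2R records.
predicted = ["normal", "dos", "dos", "probe", "normal", "normal", "dos", "normal"]

if __name__ == "__main__":
    print(overall_accuracy(actual, predicted))
    print(per_category_accuracy(actual, predicted))
```

Reporting the per-category figures alongside the overall figure matters because a classifier can score well overall while failing entirely on rare attack categories, as the toy predictions above do for R2L and U2R.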

Cite This Paper

Iknoor Singh Arora, Gurpriya Kaur Bhatia, Amrit Pal Singh, "Comparative Analysis of Classification Algorithms on KDD'99 Data Set", International Journal of Computer Network and Information Security (IJCNIS), Vol.8, No.9, pp.34-40, 2016. DOI: 10.5815/ijcnis.2016.09.05
