Empirical Analysis of Bagged SVM Classifier for Data Mining Applications

Author(s)

M.Govindarajan 1,*

1. Department of Computer Science and Engineering, Annamalai University, Annamalai Nagar – 608002, Tamil Nadu

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2013.10.08

Received: 15 Jul. 2013 / Revised: 12 Aug. 2013 / Accepted: 6 Sep. 2013 / Published: 8 Oct. 2013

Index Terms

Data Mining, Support Vector Machine, Intrusion Detection, Direct Marketing, Signature Verification, Classification Accuracy, Ensemble Method

Abstract

Data mining is the use of algorithms to extract the information and patterns derived by the knowledge discovery in databases process. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. The feasibility and benefits of the proposed approach are demonstrated by means of data mining applications such as intrusion detection, direct marketing, and signature verification. A variety of techniques have been employed for analysis, ranging from traditional statistical methods to data mining approaches. Bagging and boosting are two relatively new but popular methods for producing ensembles. In this work, bagging is evaluated on real and benchmark data sets of intrusion detection, direct marketing, and signature verification in conjunction with SVM as the base learner. The proposed bagged SVM is superior to the individual approach for data mining applications in terms of classification accuracy.
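The bagging procedure named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not state the kernel, the training algorithm, the ensemble size, or the combination rule, so the linear SVM trained by Pegasos-style stochastic subgradient descent, the 11 bootstrap replicates, the majority vote, and the synthetic two-class data are all assumptions chosen for illustration.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Train a linear SVM (labels +1/-1) with Pegasos-style subgradient steps.
    Assumption: the abstract does not specify the SVM variant used."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)  # last weight acts as the bias term
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            xi = X[i] + [1.0]      # constant feature for the bias
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrink
            if margin < 1.0:       # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def svm_predict(w, x):
    score = sum(wj * xj for wj, xj in zip(w, x + [1.0]))
    return 1 if score >= 0 else -1

def train_bagged_svm(X, y, n_estimators=11, seed=0):
    """Fit one SVM per bootstrap replicate of the training set."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for k in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_linear_svm([X[i] for i in idx],
                                       [y[i] for i in idx], seed=seed + k))
    return models

def bagged_predict(models, x):
    """Combine the base learners by unweighted majority vote."""
    votes = sum(svm_predict(w, x) for w in models)
    return 1 if votes >= 0 else -1

def make_blobs(n_per_class, seed):
    """Hypothetical stand-in data: two Gaussian clusters, one per class."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y

X_train, y_train = make_blobs(100, seed=1)
X_test, y_test = make_blobs(50, seed=2)
models = train_bagged_svm(X_train, y_train)
predictions = [bagged_predict(models, x) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"bagged linear SVM accuracy: {accuracy:.2f}")
```

Each base learner sees a different bootstrap replicate, so the vote averages out some of the variance of a single SVM. An odd ensemble size is used here only so that the two-class vote cannot tie.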

Cite This Paper

M.Govindarajan, "Empirical Analysis of Bagged SVM Classifier for Data Mining Applications", International Journal of Modern Education and Computer Science (IJMECS), vol.5, no.10, pp.64-71, 2013. DOI:10.5815/ijmecs.2013.10.08
