A Hybrid RBF-SVM Ensemble Approach for Data Mining Applications

pp. 84-95

Author(s)

M.Govindarajan 1,*

1. Department of Computer Science and Engineering, Annamalai University, Annamalai Nagar – 608002, Tamil Nadu, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2014.03.09

Received: 5 Jul. 2013 / Revised: 5 Oct. 2013 / Accepted: 11 Dec. 2013 / Published: 8 Feb. 2014

Index Terms

Machine learning, Radial Basis Function, Support Vector Machine, Intrusion Detection, Direct Marketing, Signature Verification, Ensemble, Classification Accuracy

Abstract

One of the major developments in machine learning over the past decade is the ensemble method, which builds a highly accurate classifier by combining many moderately accurate component classifiers. This paper applies an ensemble of classification methods to data mining applications such as intrusion detection, direct marketing, and signature verification. A new hybrid classification method is proposed for heterogeneous ensemble classifiers using arcing, and their performance is analyzed in terms of accuracy. The classifier ensemble is designed using a Radial Basis Function (RBF) and a Support Vector Machine (SVM) as base classifiers. Modified training sets are formed by resampling from the original training set; classifiers are constructed from these training sets and then combined by voting. The proposed RBF-SVM hybrid system is superior to the individual approaches for intrusion detection, direct marketing, and signature verification in terms of classification accuracy.
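The resample-train-vote loop described in the abstract can be illustrated with a minimal sketch. Several points here are assumptions, not details taken from the paper: the abstract does not say which arcing variant was used, so the sketch uses Breiman's arc-x4 rule (resampling weight proportional to 1 + m^4, where m is the number of times an example has been misclassified so far); the two base learners are simple pure-Python stand-ins (a nearest-centroid rule and a 1-nearest-neighbour rule) in place of the RBF network and SVM; and all names (`arc_x4_fit`, `vote_predict`, the toy data) are hypothetical.

```python
import random
from collections import Counter


def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


class NearestCentroid:
    """Stand-in base learner (assumption: NOT the paper's RBF network)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: sqdist(x, self.centroids[c]))
                for x in X]


class OneNearestNeighbour:
    """Stand-in base learner (assumption: NOT the paper's SVM)."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: sqdist(x, self.X[i]))]
                for x in X]


def arc_x4_fit(X, y, learner_classes, rounds=5, seed=0):
    """Train a heterogeneous ensemble with arc-x4 style adaptive resampling."""
    rng = random.Random(seed)
    n = len(X)
    mistakes = [0] * n  # times each training example has been misclassified
    ensemble = []
    for t in range(rounds):
        # arc-x4: sampling weight grows with the 4th power of past mistakes
        weights = [1 + m ** 4 for m in mistakes]
        idx = rng.choices(range(n), weights=weights, k=n)
        # alternate between the base learner types
        learner = learner_classes[t % len(learner_classes)]()
        learner.fit([X[i] for i in idx], [y[i] for i in idx])
        ensemble.append(learner)
        # update mistake counts on the ORIGINAL training set
        for i, pred in enumerate(learner.predict(X)):
            if pred != y[i]:
                mistakes[i] += 1
    return ensemble


def vote_predict(ensemble, X):
    """Combine the ensemble members by unweighted majority vote."""
    per_model = [clf.predict(X) for clf in ensemble]
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*per_model)]


# Toy two-class data (hypothetical, for illustration only)
X_train = ([[0.1 * i, 0.1 * (i % 3)] for i in range(10)]
           + [[5.0 + 0.1 * i, 5.0 + 0.1 * (i % 3)] for i in range(10)])
y_train = [0] * 10 + [1] * 10

ensemble = arc_x4_fit(X_train, y_train, [NearestCentroid, OneNearestNeighbour])
print(vote_predict(ensemble, [[0.2, 0.1], [5.3, 5.1]]))
```

Swapping the stand-ins for real RBF and SVM implementations only requires objects that expose the same `fit` and `predict` methods; the resampling and voting logic is unchanged.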

Cite This Paper

M.Govindarajan, "A Hybrid RBF-SVM Ensemble Approach for Data Mining Applications", International Journal of Intelligent Systems and Applications(IJISA), vol.6, no.3, pp.84-95, 2014. DOI:10.5815/ijisa.2014.03.09
