Email Spam Detection Using Combination of Particle Swarm Optimization and Artificial Neural Network and Support Vector Machine

Full Text (PDF, 518KB), PP.68-74


Author(s)

Mohammad Zavvar 1,*; Meysam Rezaei 2; Shole Garavand 2

1. Sama Technical and Vocational Training College, Islamic Azad University, Gorgan Branch, Gorgan, Iran

2. Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2016.07.08

Received: 5 Mar. 2016 / Revised: 22 Apr. 2016 / Accepted: 23 May 2016 / Published: 8 Jul. 2016

Index Terms

Particle swarm optimization, artificial neural network, support vector machine, email, spam, classifying

Abstract

The growing use of e-mail worldwide, driven by its simplicity and low cost, has led many Internet users to move their work online. At the same time, many natural and legal persons send unsolicited bulk e-mail. Classifying and identifying spam e-mail is therefore very important. In this paper, we combine Particle Swarm Optimization and an Artificial Neural Network for feature selection and use a Support Vector Machine to classify and separate spam. Finally, we compare the proposed method with other data classification methods, such as Self-Organizing Map and K-Means, using the Area Under Curve criterion. The results indicate that the proposed method achieves a better Area Under Curve than the other methods.
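The Area Under Curve criterion used for the comparison can be illustrated with a minimal sketch. It assumes binary labels (1 for spam, 0 for legitimate mail) and a real-valued classifier score per message, and it computes AUC through the pairwise rank identity: the fraction of (spam, non-spam) pairs in which the spam message receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (1 = spam, an assumed convention).
    scores: iterable of classifier scores, higher meaning "more spam-like".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one spam and one non-spam example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # spam ranked above non-spam
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (not results from the paper).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every spam message is scored above every legitimate one, while 0.5 corresponds to random ranking. The quadratic pairwise loop is chosen for clarity; a sort-based rank computation would be preferable on larger data sets.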

Cite This Paper

Mohammad Zavvar, Meysam Rezaei, Shole Garavand, "Email Spam Detection Using Combination of Particle Swarm Optimization and Artificial Neural Network and Support Vector Machine", International Journal of Modern Education and Computer Science (IJMECS), Vol.8, No.7, pp.68-74, 2016. DOI:10.5815/ijmecs.2016.07.08
