Threshold Controlled Binary Particle Swarm Optimization for High Dimensional Feature Selection

Pages: 75-84

Author(s)

Sonu Lal Gupta 1,*, Anurag Singh Baghel 1, Asif Iqbal 2

1. Gautam Buddha University, Greater Noida, India

2. PIRO Technologies PVT. LTD., New Delhi, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2018.08.07

Received: 10 Oct. 2017 / Revised: 10 Nov. 2017 / Accepted: 27 Nov. 2017 / Published: 8 Aug. 2018

Index Terms

Particle Swarm Optimization (PSO), Binary PSO (BPSO), Feature Selection, Threshold Controlled BPSO (TC-BPSO), Dimensionality Reduction, Support Vector Machine (SVM)

Abstract

Dimensionality reduction, or the optimal selection of features, is a challenging task because the search space is large. Much research has been carried out in this domain to improve accuracy and to reduce computational complexity. Feature selection based on Particle Swarm Optimization (PSO) appears very promising and has been used extensively for this task. In this paper, a Threshold Controlled Binary Particle Swarm Optimization (TC-BPSO) combined with a Multi-Class Support Vector Machine (MC-SVM) is proposed and compared with Conventional Binary Particle Swarm Optimization (C-BPSO). TC-BPSO selects the features, while MC-SVM is used to calculate the classification accuracy: 70% of the data is used to train the MC-SVM model, and the remaining 30% is used for testing. The proposed approach is tested on ten datasets of varying difficulty: some have a large number of features and others few, some have only two classes and others many, and some have few instances and others many. The results obtained on these datasets are compared with some existing methods. Experiments show that the results are very promising, achieving the best accuracy with the fewest features. The proposed approach outperforms C-BPSO in all respects on most of the datasets and is 3-4 times faster computationally. It also outperforms the existing work in all respects and is 5-8 times faster computationally.
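
The wrapper pipeline described in the abstract (a binary PSO proposes feature masks, a classifier scores each mask on a 70/30 split) can be illustrated with a minimal sketch. The abstract does not say how the threshold is applied, so this sketch assumes that a fixed `threshold` on the sigmoid of the velocity replaces the random draw of conventional BPSO. To stay dependency-free, a nearest-centroid classifier stands in for MC-SVM. The function names, the parameter values, and the tie-break that favours fewer features are all hypothetical and are not taken from the paper.

```python
import math
import random


def split_70_30(n_rows, rng):
    """Shuffle row indices and split them 70% train / 30% test."""
    idx = list(range(n_rows))
    rng.shuffle(idx)
    cut = int(0.7 * n_rows)
    return idx[:cut], idx[cut:]


def centroid_accuracy(rows, labels, mask, train, test):
    """Stand-in for MC-SVM (assumption): nearest-centroid test accuracy."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    sums, counts = {}, {}
    for i in train:
        acc = sums.setdefault(labels[i], [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += rows[i][j]
        counts[labels[i]] = counts.get(labels[i], 0) + 1
    cents = {c: [v / counts[c] for v in s] for c, s in sums.items()}
    hits = 0
    for i in test:
        pred = min(cents, key=lambda c: sum(
            (rows[i][j] - cents[c][k]) ** 2 for k, j in enumerate(cols)))
        hits += pred == labels[i]
    return hits / len(test)


def tc_bpso_select(rows, labels, n_particles=10, n_iter=20,
                   threshold=0.5, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Return (best_mask, best_accuracy) found by the thresholded BPSO."""
    rng = random.Random(seed)
    n_feat = len(rows[0])
    train, test = split_70_30(len(rows), rng)

    def fitness(mask):
        # Accuracy first; fewer selected features breaks ties (assumption).
        return (centroid_accuracy(rows, labels, mask, train, test), -sum(mask))

    pos = [[rng.randint(0, 1) for _ in range(n_feat)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_feat)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_feat):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp the velocity
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))  # sigmoid transfer
                # Assumed threshold rule: a fixed cut-off, not a random draw.
                pos[i][j] = 1 if prob > threshold else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit[0]
```

Calling `tc_bpso_select(rows, labels)` on a list of numeric feature rows and a list of class labels returns a 0/1 mask over the features together with the held-out accuracy of that mask. Swapping `centroid_accuracy` for an SVM-based scorer would bring the sketch closer to the setup the abstract describes.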

Cite This Paper

Sonu Lal Gupta, Anurag Singh Baghel, Asif Iqbal, "Threshold Controlled Binary Particle Swarm Optimization for High Dimensional Feature Selection", International Journal of Intelligent Systems and Applications (IJISA), Vol.10, No.8, pp.75-84, 2018. DOI: 10.5815/ijisa.2018.08.07
