An Optimization of Feature Selection for Classification using Modified Bat Algorithm

Full Text (PDF, 385KB), PP.38-46

Author(s)

V. Yasaswini 1,* Santhi Baskaran 2

1. Computer Science and Engineering Department, Pondicherry Engineering College, Puducherry, India

2. Information Technology Department, Pondicherry Engineering College, Puducherry, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2021.04.04

Received: 5 Feb. 2021 / Revised: 3 Apr. 2021 / Accepted: 22 Apr. 2021 / Published: 8 Aug. 2021

Index Terms

Optimization, Meta-heuristic, Feature Extraction, Deep learning, Firefly, Cuckoo, Harmony, Bat Algorithm

Abstract

Data mining is the process of searching large existing databases to extract new and useful information. It now plays a vital role in fields such as medicine, engineering, banking, education and fraud detection. In this paper, feature selection, which is a part of data mining, is performed as a step toward classification. The role of feature selection is considered in the context of deep learning, along with its relation to feature engineering. Feature selection is a preprocessing technique that selects the relevant features from a data set so that classification yields accurate results. Nature-inspired optimization algorithms such as Ant Colony, Firefly, Cuckoo Search and Harmony Search have shown good performance, achieving high accuracy rates and good F-measure values with fewer selected features. These algorithms are used to perform classification that accurately predicts the target class for each case in the data set. We propose a technique that uses meta-heuristic algorithms to optimize feature selection for classification. We applied a recent optimization algorithm, the Modified Bat algorithm, to University of California Irvine (UCI) datasets; it produced results comparable to those of the best-performing existing algorithm, Firefly, while selecting fewer features. The work is implemented in Java, and medical datasets are used. These datasets were chosen because their class features are nominal. The number of attributes, instances and classes varies across the chosen datasets so that they represent different combinations. Classification is done using the J48 classifier in the WEKA tool. We present a thorough comparison of the results of the algorithms used here with those of the existing algorithms.
The significance of this research is that selecting the best features from all the existing features yields the best accuracy rates, which helps in extracting information from raw data in the data mining domain. Its value lies in serving key fields such as medicine and banking, where exact and reliable results are needed. The main aim of the research is to optimize feature selection so as to achieve maximum predictive accuracy on the data sets. The approach handles both single-variable and multi-variable functions by generating a binary representation of the features in the dataset, and it improves classification performance by using nature-inspired and meta-heuristic algorithms.
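
To make the idea of a binary representation of features concrete, the following is a minimal Java sketch of a generic bat-style search over feature masks. It is not the authors' Modified Bat algorithm, whose details are not given in this abstract. The class name `BinaryBatSketch`, the method names, the parameter values (frequency range, loudness, pulse rate) and the toy fitness function are all assumptions made for illustration. In the setting the abstract describes, the fitness would presumably be the accuracy of the J48 classifier on the selected features; here it is replaced by a hypothetical stand-in so the example runs without WEKA.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Illustrative sketch only: a simplified binary bat-style search over feature masks.
class BinaryBatSketch {

    /** Returns the best feature mask found (true = feature kept). Higher fitness is better. */
    static boolean[] select(int numFeatures, int numBats, int iterations,
                            ToDoubleFunction<boolean[]> fitness, long seed) {
        Random rng = new Random(seed);
        final double fMin = 0.0, fMax = 2.0;          // assumed frequency range
        final double loudness = 0.9, pulseRate = 0.5; // held constant for brevity (assumed)

        boolean[][] position = new boolean[numBats][numFeatures];
        double[][] velocity = new double[numBats][numFeatures];
        double[] score = new double[numBats];
        boolean[] best = new boolean[numFeatures];
        double bestScore = Double.NEGATIVE_INFINITY;

        // Random initial masks; remember the best one seen so far.
        for (int i = 0; i < numBats; i++) {
            for (int j = 0; j < numFeatures; j++) {
                position[i][j] = rng.nextBoolean();
            }
            score[i] = fitness.applyAsDouble(position[i]);
            if (score[i] > bestScore) {
                bestScore = score[i];
                best = position[i].clone();
            }
        }

        for (int t = 0; t < iterations; t++) {
            for (int i = 0; i < numBats; i++) {
                double freq = fMin + (fMax - fMin) * rng.nextDouble();
                boolean[] candidate = new boolean[numFeatures];
                for (int j = 0; j < numFeatures; j++) {
                    // Velocity is pulled toward the best mask; a sigmoid turns it
                    // into the probability that bit j is set.
                    int towardBest = (best[j] ? 1 : 0) - (position[i][j] ? 1 : 0);
                    velocity[i][j] += towardBest * freq;
                    double probOne = 1.0 / (1.0 + Math.exp(-velocity[i][j]));
                    candidate[j] = rng.nextDouble() < probOne;
                }
                // Local search: sometimes try the best mask with one bit flipped instead.
                if (rng.nextDouble() > pulseRate) {
                    candidate = best.clone();
                    int flip = rng.nextInt(numFeatures);
                    candidate[flip] = !candidate[flip];
                }
                double candidateScore = fitness.applyAsDouble(candidate);
                // The bat accepts a candidate that is no worse, with probability `loudness`.
                if (candidateScore >= score[i] && rng.nextDouble() < loudness) {
                    position[i] = candidate;
                    score[i] = candidateScore;
                }
                if (candidateScore > bestScore) {
                    bestScore = candidateScore;
                    best = candidate.clone();
                }
            }
        }
        return best;
    }

    /** Hypothetical stand-in for classifier accuracy; expects masks of at most 6 features. */
    static double toyFitness(boolean[] mask) {
        boolean[] relevant = {true, false, true, false, false, true};
        double value = 0.0;
        for (int j = 0; j < mask.length; j++) {
            if (mask[j] && relevant[j]) value += 1.0; // reward a useful feature
            if (mask[j]) value -= 0.1;                // small cost per selected feature
        }
        return value;
    }

    public static void main(String[] args) {
        boolean[] mask = select(6, 20, 200, BinaryBatSketch::toyFitness, 42L);
        System.out.println(Arrays.toString(mask));
    }
}
```

The per-feature penalty in the toy fitness mirrors the stated goal of keeping accuracy high while selecting fewer features. Passing the fitness as a function keeps the search independent of any particular classifier.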

Cite This Paper

V. Yasaswini, Santhi Baskaran, "An Optimization of Feature Selection for Classification using Modified Bat Algorithm", International Journal of Information Technology and Computer Science (IJITCS), Vol.13, No.4, pp.38-46, 2021. DOI: 10.5815/ijitcs.2021.04.04
