Comparative Analysis of KNN Algorithm using Various Normalization Techniques

Full Text (PDF, 416KB), PP.36-42


Author(s)

Amit Pandey 1,*, Achin Jain 1

1. Bharati Vidyapeeth’s College of Engineering, Information Technology, New Delhi, 110063, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2017.11.04

Received: 19 Jun. 2017 / Revised: 20 Jul. 2017 / Accepted: 10 Aug. 2017 / Published: 8 Nov. 2017

Index Terms

KNN, Classification, Normalization, Z-Score Normalization, Min-Max Normalization, Cross Validation Method

Abstract

Classification is the task of assigning individual observations to one of a set of groups. In pattern recognition, the K-Nearest Neighbors (KNN) algorithm is a non-parametric method for classification and regression. KNN has been widely used in data mining and machine learning because it is simple yet performs well. A classifier predicts the labels of test data points after being trained on sample data. Although researchers have proposed many classification methods over the past few decades, KNN remains one of the most popular ways to classify a data set. Its input consists of the k closest training examples in the feature space; the neighbors are drawn from a set of objects whose properties or class values are known, which can be regarded as the training dataset. In this paper, we apply two normalization techniques to the IRIS Dataset and measure the classification accuracy of KNN with the Cross-Validation method, using the R programming language. The two approaches considered are data with Z-Score Normalization and data with Min-Max Normalization.
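The pipeline described in the abstract (normalize the features, classify with KNN, score with cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper uses R and the IRIS Dataset, whereas this sketch uses plain Python and a small hypothetical toy dataset, and all function names (`z_score`, `min_max`, `normalize`, `knn_predict`, `cross_val_accuracy`) are assumptions introduced here. Z-score normalization is taken to be (x - mean) / standard deviation per feature, and min-max normalization to be (x - min) / (max - min) per feature.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def z_score(column):
    """Rescale a feature column to zero mean and unit (population) std."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]


def min_max(column):
    """Rescale a feature column linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in column]


def normalize(rows, scaler):
    """Apply a column-wise scaler to a list of feature rows."""
    columns = [scaler(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def cross_val_accuracy(rows, labels, k, folds):
    """Fraction of samples classified correctly when each fold is held out
    once (sample i is assigned to fold i % folds)."""
    correct = 0
    for f in range(folds):
        pairs = list(enumerate(zip(rows, labels)))
        train = [(x, y) for i, (x, y) in pairs if i % folds != f]
        test = [(x, y) for i, (x, y) in pairs if i % folds == f]
        tx, ty = [x for x, _ in train], [y for _, y in train]
        correct += sum(knn_predict(tx, ty, x, k) == y for x, y in test)
    return correct / len(rows)


# Hypothetical toy data (not from the paper): the second feature is on a
# much larger scale than the first, which is the case normalization targets.
rows = [[1.0, 1000.0], [1.2, 5000.0], [0.9, 3000.0], [1.1, 4000.0],
        [3.0, 1500.0], [3.2, 4500.0], [2.9, 2500.0], [3.1, 3500.0]]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

acc_z = cross_val_accuracy(normalize(rows, z_score), labels, k=3, folds=4)
acc_mm = cross_val_accuracy(normalize(rows, min_max), labels, k=3, folds=4)
print(f"z-score: {acc_z:.2f}, min-max: {acc_mm:.2f}")
```

For brevity the sketch normalizes the whole dataset before splitting it into folds; a stricter evaluation would compute the scaling statistics on each training fold only and apply them to the held-out fold.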

Cite This Paper

Amit Pandey, Achin Jain, "Comparative Analysis of KNN Algorithm using Various Normalization Techniques", International Journal of Computer Network and Information Security (IJCNIS), Vol.9, No.11, pp.36-42, 2017. DOI:10.5815/ijcnis.2017.11.04
