IJMECS Vol. 7, No. 6, 8 Jun. 2015
Clusters, clustering algorithms, Euclidean distance, data mining
The k-means algorithm is one of the most popular algorithms for data clustering. It groups similar items from a large data set into clusters using a brute-force strategy of repeated calculations, which makes its computational cost very high. Several studies have sought to reduce this cost. This paper presents a modified version of the k-means algorithm that uses several checkpoint values to divide the data set into a specified number of clusters. The proposed algorithm requires less computation and achieves higher accuracy than both the traditional k-means algorithm and some of its modified variants.
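For orientation, the traditional k-means procedure that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of the standard assign-and-update loop with Euclidean distance, not the modified algorithm proposed in the paper, whose checkpoint values are not described here. The function name `kmeans`, its parameters (`max_iter`, `seed`, `init_centroids`), the random choice of initial centroids, and the stopping rule are all assumptions made for this sketch.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0, init_centroids=None):
    """Traditional k-means on a list of equal-length numeric tuples.

    Returns (centroids, labels). All names and defaults are illustrative.
    """
    if init_centroids is not None:
        centroids = [tuple(c) for c in init_centroids]
    else:
        # Assumption: start from k data points drawn at random.
        centroids = random.Random(seed).sample(points, k)

    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        # This is the repeated calculation that dominates the cost.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so centroids would not move
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, labels
```

Each pass computes the distance from every point to every centroid, so the work per iteration grows with the number of points times the number of clusters. That per-iteration cost is what variants of k-means typically try to reduce.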
Sharfuddin Mahmood, Mohammad Saiedur Rahaman, Dip Nandi, Mashiour Rahman, "A Proposed Modification of K-Means Algorithm", International Journal of Modern Education and Computer Science (IJMECS), vol.7, no.6, pp.37-42, 2015. DOI:10.5815/ijmecs.2015.06.06