A Clustering Algorithm based on Local Density of Points

Full Text (PDF, 940KB), PP.9-16

Views: 0 Downloads: 0

Author(s)

Ahmed Fahim 1,2,*

1. Faculty of Sciences and Humanitarian Study,Prince Sattam Bin Abdulaziz University, Al-Aflaj, KSA

2. Faculty of computers and information, Suez University, Suez, Egypt.

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2017.12.02

Received: 16 Oct. 2017 / Revised: 25 Oct. 2017 / Accepted: 8 Nov. 2017 / Published: 8 Dec. 2017

Index Terms

Clustering methods, CBLDP algorithm, Cluster analysis, DBSCAN algorithm

Abstract

Data clustering is very active and attractive research area in data mining; there are dozens of clustering algorithms that have been published. Any clustering algorithm aims to classify data points according to some criteria. DBSCAN is the most famous and well-studied algorithm. Clusters are recorded as dense regions separated from each other by spars regions. It is based on enumerating the points in Eps-neighborhood of each point. This paper proposes a clustering method based on k-nearest neighbors and local density of objects in data; that is computed as the total of distances to the most near points that affected on it. Cluster is defined as a continuous region that has points within local densities fall between minimum local density and maximum local density. The proposed method identifies clusters of different shapes, sizes, and densities. It requires only three parameters; these parameters take only integer values. So it is easy to determine. The experimental results demonstrate the superior of the proposed method in identifying varied density clusters.

Cite This Paper

Ahmed Fahim, "A Clustering Algorithm based on Local Density of Points", International Journal of Modern Education and Computer Science(IJMECS), Vol.9, No.12, pp. 9-16, 2017. DOI:10.5815/ijmecs.2017.12.02

Reference

[1]R. Xu, D. Wunsch “Survey of clustering algorithms.” IEEE Transactions on neural networks, vol. 16(3), pp. 645-678.‏ 2005.
[2]P. Berkhin “A survey of clustering data mining techniques”. Grouping multidimensional data: Recent Advances in Clustering, springer, pp. 25-71. 2006.
[3]J. A. Hartigan, M. A. Wong “Algorithm AS 136: A k-means clustering algorithm.” Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 28(1), 100-108,‏ 1979.
[4]L. Kaufman, P. J. Rousseeuw Partitioning around medoids (program pam). Finding groups in data: an introduction to cluster analysis. John Wiley & Sons, 1990.
[5]R. T. Ng, J. Han “CLARANS: A method for clustering objects for spatial data mining”. IEEE transactions on knowledge and data engineering, vol. 14(5), pp. 1003-1016, 2002.
[6]R. Sibson “SLINK: an optimally efficient algorithm for the single-link cluster method.” The computer journal, vol. 16(1): pp. 30-34, 1973.
[7]H. K. Seifoddini “Single linkage versus average linkage clustering in machine cells formation applications.” Computers & Industrial Engineering, vol. 16(3), pp. 419-426.‏ 1989.
[8]D. Defays “An efficient algorithm for a complete link method.” The Computer Journal, Vol. 20(4): pp.364-366, 1977.
[9]S. Guha, R. Rajeev, S. Kyuseok “CURE: an efficient clustering algorithm for large databases.” ACM Sigmod Record. ACM, vol. 27(2), pp. 73-84. 1998.
[10]G. Karypis, E. Han, V. Kumar “Chameleon: Hierarchical clustering using dynamic modeling.” Computer, vol. 32(8), pp. 68-75, 1999.
[11]T. Zhang, R. Ramakrishnan, M. Livny “BIRCH: an efficient data clustering method for very large databases.” ACM Sigmod Record. ACM, vol. 25(2),‏ pp.103-114. 1996.
[12]S., Guha R. Rastogi, K. Shim “ROCK: A robust clustering algorithm for categorical attributes.” Data Engineering, 1999. Proceedings 15th International Conference on.IEEE, pp. 512-521,1999.
[13]M. Ester, H. P. Kriegel, J. Sander, X. Xu “Density-based spatial clustering of applications with noise.” Int. Conf. Knowledge Discovery and Data Mining, vol. 240,‏ 1996.
[14]P. Liu, D. Zhou, N. Wu “VDBSCAN: varied density based spatial clustering of applications with noise.” In: Service Systems and Service Management, 2007 International Conference on. IEEE, pp. 1-4,‏ 2007.
[15]C. F. Tsai, C. W. Liu “KIDBSCAN: a new efficient data clustering algorithm.” In ICAISC, pp.702-711,‏ 2006.
[16]B. Borah, D. K. Bhattacharyya “DDSC: a density differentiated spatial clustering technique” Journal of computers, vol. 3(2), pp.72-79,‏ 2008.
[17]A. Fahim, A.-E Salem , F. Torkey, M. Ramadan, G. Saake “Scalable varied density clustering algorithm for large datasets.” Journal of Software Engineering and Applications, vol. 3(06), pp. 593-602,‏ 2010.
[18]A. M. Fahim “A clustering algorithm for discovering varied density clusters. International Research Journal of Engineering and Technology, vol. 2(08), ‏pp.566-573, 2015.
[19]A. Hinneburg, D. A. Keim “An efficient approach to clustering in large multimedia databases with noise. KDD, pp. 58-65, 1998.
[20]M. Ankerst, M. M. Breunig, H.-P. Kriegel, J. Sander. “OPTICS: Ordering Points To Identify the Clustering Structure”, Int. Conf. on Management of Data, Philadelphia PA, in ACM Sigmod record vol. 28(2), pp. 49-60, 1999.
[21]M. T. H. Elbatta and W. M. Ashour. “A Dynamic Method for Discovering Density Varied Clusters”. International Journal of Signal Processing, Image Processing and Pattern Recognition vol. 6(1), pp.123-134, February, 2013.
[22]S. Louhichi, M. Gzara, H. Abdallah. “A density based algorithm for discovering clusters with varied density”. IEEE conf. 2014.
[23]M. Debnath, P. K. Tripathi, R. Elmasri. “K-DBSCAN: Identifying Spatial Clusters With Differing Density Levels”. International Workshop on Data Mining with Industrial Applications, pp. 51-60, 2015.
[24]D. Cheng, Q. Zhu, J. Huang, L. Yang, Q. Wu “Natural neighbor-based clustering algorithm with local representatives.” Knowledge-Based Systems, vol. 123, pp.238-253, 2017.