Homogeneous Densities Clustering Algorithm

Author(s)

Ahmed Fahim 1,2,*

1. Faculty of Sciences and Humanitarian Study, Prince Sattam Bin Abdulaziz University, Al-Aflaj, KSA

2. Faculty of Computers and Information, Suez University, Suez, Egypt

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2018.10.01

Received: 1 Sep. 2018 / Revised: 18 Sep. 2018 / Accepted: 24 Sep. 2018 / Published: 8 Oct. 2018

Index Terms

Cluster analysis, DBSCAN algorithm, clustering algorithms, homogeneous clusters

Abstract

Density-based clustering is a very attractive research area in data clustering. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is the pioneer in this area. It can handle clusters of varied shapes and sizes, and it copes with noise and outliers efficiently. However, it fails to handle clusters with varied densities because of its global parameter Eps. In this paper, we propose a method that overcomes this problem. The method does not allow large variation in density within a cluster and uses only two input parameters, called minpts and maxpts, which govern the minimum and maximum density of core objects within a cluster. The maxpts parameter is used to control the value of Eps (the neighborhood radius) in the original DBSCAN. Allowing Eps to vary from one cluster to another according to the density of the region makes DBSCAN able to handle clusters of varied density and to discover homogeneous clusters. The experimental results reflect the efficiency of the proposed method despite its simplicity.
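To make the role of a density-dependent radius concrete, the following is a minimal sketch and not the procedure from the paper. It assumes, purely for illustration, that a local Eps for each point can be taken as the distance to its maxpts-th nearest neighbor. The helper names (neighbors_within, kth_neighbor_distance, local_eps) and the toy data are hypothetical. The sketch shows that a single global Eps that suits a dense region finds no neighbors in a sparse one, while a radius tied to a neighbor count adapts to each region.

```python
import math


def neighbors_within(points, index, eps):
    """Count the other points lying within distance eps of points[index]."""
    p = points[index]
    return sum(
        1
        for j, q in enumerate(points)
        if j != index and math.dist(p, q) <= eps
    )


def kth_neighbor_distance(points, index, k):
    """Distance from points[index] to its k-th nearest other point."""
    p = points[index]
    dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != index)
    return dists[k - 1]


def local_eps(points, maxpts):
    """Illustrative assumption: per-point radius = distance to the maxpts-th neighbor."""
    return [kth_neighbor_distance(points, i, maxpts) for i in range(len(points))]


# Toy data: a dense group (spacing 1) and a sparse group (spacing 10).
dense = [(float(x), 0.0) for x in range(5)]
sparse = [(100.0 + 10.0 * x, 0.0) for x in range(5)]
points = dense + sparse

# A global radius tuned for the dense group sees nothing in the sparse group.
print(neighbors_within(points, 2, 2.0), neighbors_within(points, 7, 2.0))

# A radius derived from a neighbor count differs between the two regions.
radii = local_eps(points, maxpts=2)
print(max(radii[:5]), min(radii[5:]))
```

In this toy setting every point of the dense group receives a smaller radius than any point of the sparse group, which is the kind of per-region adaptation the abstract attributes to letting Eps vary with density.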

Cite This Paper

Ahmed Fahim, "Homogeneous Densities Clustering Algorithm", International Journal of Information Technology and Computer Science (IJITCS), Vol. 10, No. 10, pp. 1-10, 2018. DOI: 10.5815/ijitcs.2018.10.01
