Possibilistic Fuzzy Clustering for Categorical Data Arrays Based on Frequency Prototypes and Dissimilarity Measures

pp. 55-61

Author(s)

Zhengbing Hu (1,*), Yevgeniy V. Bodyanskiy (2), Oleksii K. Tyshchenko (2), Viktoriia O. Samitova (2)

1. School of Educational Information Technology, Central China Normal University, Wuhan, China

2. Kharkiv National University of Radio Electronics, Kharkiv, Ukraine

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2017.05.07

Received: 6 Jul. 2016 / Revised: 1 Oct. 2016 / Accepted: 22 Dec. 2016 / Published: 8 May 2017

Index Terms

Computational Intelligence, Machine Learning, Categorical Data, Categorical Scale, Possibilistic Fuzzy Clustering, Frequency Prototype, Dissimilarity Measure

Abstract

This paper proposes fuzzy clustering procedures for categorical data. Most well-known conventional clustering methods have difficulty processing such data because categorical values carry no inherent notion of similarity. The paper gives a detailed description of a possibilistic fuzzy clustering method for categorical data based on frequency-based cluster prototypes and dissimilarity measures.
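To make the ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes that a frequency prototype stores, for each attribute, the weighted relative frequency of every category value; that the dissimilarity between an object and a prototype is the sum over attributes of one minus the frequency of the object's value; and that the possibilistic membership takes the familiar Krishnapuram-Keller form with a scale parameter eta. The function names, the parameter eta, and the fuzzifier m are all illustrative assumptions; the paper's exact formulas are not reproduced here.

```python
from collections import Counter


def frequency_prototype(objects, weights):
    """Build a frequency-based prototype (assumed form).

    For each attribute, return a dict mapping every observed category
    value to its weighted relative frequency. Weights must be positive.
    """
    n_attrs = len(objects[0])
    total = sum(weights)
    prototype = []
    for j in range(n_attrs):
        freq = Counter()
        for obj, w in zip(objects, weights):
            freq[obj[j]] += w
        prototype.append({value: f / total for value, f in freq.items()})
    return prototype


def dissimilarity(obj, prototype):
    """Assumed dissimilarity: sum over attributes of (1 - frequency of
    the object's value in the prototype). Unseen values count as 0."""
    return sum(1.0 - attr_freq.get(value, 0.0)
               for value, attr_freq in zip(obj, prototype))


def typicality(d, eta, m=2.0):
    """Possibilistic membership in the Krishnapuram-Keller form, with the
    categorical dissimilarity d used in place of a squared distance."""
    return 1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1.0)))


# Toy data: three objects described by two categorical attributes.
objects = [("red", "small"), ("red", "large"), ("blue", "small")]
weights = [1.0, 1.0, 1.0]  # e.g. current membership levels in one cluster

proto = frequency_prototype(objects, weights)
d_seen = dissimilarity(("red", "small"), proto)
d_unseen = dissimilarity(("green", "small"), proto)
```

In this sketch an object whose values are frequent in the cluster gets a smaller dissimilarity than one with a value the cluster has never seen, and the membership equals 0.5 exactly when the dissimilarity equals eta. Unlike probabilistic fuzzy memberships, these values are computed per cluster and need not sum to one across clusters.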

Cite This Paper

Zhengbing Hu, Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Viktoriia O. Samitova, "Possibilistic Fuzzy Clustering for Categorical Data Arrays Based on Frequency Prototypes and Dissimilarity Measures", International Journal of Intelligent Systems and Applications (IJISA), Vol.9, No.5, pp.55-61, 2017. DOI:10.5815/ijisa.2017.05.07
