Maedeh Afzali

Work place: Manav Rachna International Institute of Research and Studies, Faridabad, 121004, India

E-mail: maedeh.af@gmail.com


Research Interests: Data Mining, Data Compression, Data Structures and Algorithms

Biography

Maedeh Afzali was born in Iran in 1988. She received her Bachelor's degree in Software Engineering from Islamic Azad University of Brigand, Iran. In 2013, she completed her M.Tech degree in Computer Science and Engineering at Manav Rachna International Institute of Research and Studies, India. At present, she is pursuing her Ph.D. in Computer Science in the Department of Computer Science and Engineering at Manav Rachna International Institute of Research and Studies, India. Her research interests include Big Data, Data Analytics, Data Mining, and Text Mining.

Author Articles
An Extensive Study of Similarity and Dissimilarity Measures Used for Text Document Clustering using K-means Algorithm

By Maedeh Afzali, Suresh Kumar

DOI: https://doi.org/10.5815/ijitcs.2018.09.08, Pub. Date: 8 Sep. 2018

In today’s world, a tremendous amount of unstructured data, especially text, is being generated through various sources. This massive amount of data has led researchers to focus on employing data mining techniques to analyse and cluster it for efficient browsing and searching. Clustering methods such as the k-means algorithm work by measuring the relationship between data objects. Accurate clustering depends on the similarity or dissimilarity measure defined to evaluate the homogeneity of the documents. A variety of measures have been proposed to date; however, not all of them are suitable for use in the k-means algorithm. In this paper, an extensive study is done to compare and analyse the performance of eight well-known similarity and dissimilarity measures that are applicable to the k-means clustering approach. For the experiments, four text document data sets are used and the results are reported.
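To make the role of the measure concrete, the following is a minimal sketch of the k-means assignment step under two commonly used measures, cosine similarity and Euclidean distance. The abstract does not name the eight measures studied, so these two are chosen only for illustration; the toy documents, the vocabulary, and all function names are assumptions, not taken from the paper.

```python
import math
from collections import Counter

def term_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine_similarity(a, b):
    """Similarity measure: 1.0 for the same direction, 0.0 for no shared terms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def euclidean_distance(a, b):
    """Dissimilarity measure: 0.0 for identical vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_to_nearest(vectors, centroids, similarity):
    """k-means assignment step: index of the most similar centroid per vector."""
    return [
        max(range(len(centroids)), key=lambda j: similarity(v, centroids[j]))
        for v in vectors
    ]

# Hypothetical toy data, for illustration only.
vocabulary = ["data", "mining", "text", "football", "match"]
docs = ["data mining text data", "text mining", "football match football"]
vectors = [term_vector(d, vocabulary) for d in docs]
centroids = [vectors[0], vectors[2]]  # two initial centroids picked by hand

labels_cosine = assign_to_nearest(vectors, centroids, cosine_similarity)
# A dissimilarity is turned into a similarity by negating it.
labels_euclid = assign_to_nearest(
    vectors, centroids, lambda a, b: -euclidean_distance(a, b)
)
```

Swapping the `similarity` argument is the only change needed to compare measures in this sketch; a full k-means run would alternate this assignment step with recomputing the centroids.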

Other Articles