Clustering of Multi Scripts Isolated Characters Using k-Means Algorithm

Author(s)

Neeru Garg 1, Munish Kumar 2

1. Computer Science & Technology, Lovely Professional University, Phagwara, Punjab

2. Department of Computer Science, Punjab University Rural Centre, Kauni, Muktsar, Punjab

DOI: https://doi.org/10.5815/ijmsc.2015.02.03

Received: 30 Apr. 2015 / Revised: 3 Jun. 2015 / Accepted: 10 Jul. 2015 / Published: 8 Aug. 2015

Index Terms

Clustering, Script identification, Stroke density, Zoning, k-Means

Abstract

This paper addresses the problem of script identification in handwritten text, which makes it possible to cluster data according to script type. A collection of handwritten text documents in three scripts, Devanagari, Gurumukhi and Roman, is taken as input, and the documents are clustered by script type. The proposed scheme for clustering handwritten multi-script documents is divided into two phases. The first phase extracts features from the given text images. In the second phase, the features extracted in the first phase are clustered with the k-Means algorithm. In the feature extraction phase, we extract four types of features: circular curvature features, horizontal stroke density features, pixel density features and zoning-based features. In this study, we consider 4,850 samples of isolated characters of the Devanagari, Gurumukhi and Roman scripts.
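The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the features in detail, so the sketch assumes that a zoning-based feature is the foreground pixel density of each zone of a binarized character image, and it omits the circular curvature and horizontal stroke density features. The function names (`zoning_features`, `farthest_first_init`, `k_means`), the 2 x 2 zone grid, the farthest-first initialization and the 4 x 4 toy images are all hypothetical choices made for illustration.

```python
def zoning_features(image, zones_per_side=2):
    """Phase 1 (assumed form): split a square binary image into zones and
    return the fraction of foreground (1) pixels in each zone, row by row."""
    step = len(image) // zones_per_side
    features = []
    for zr in range(zones_per_side):
        for zc in range(zones_per_side):
            total = sum(
                image[r][c]
                for r in range(zr * step, (zr + 1) * step)
                for c in range(zc * step, (zc + 1) * step)
            )
            features.append(total / (step * step))
    return features


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic initialization (an assumption, not from the paper):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def k_means(points, k, max_iterations=100):
    """Phase 2: standard k-Means (Lloyd's iterations).
    Returns (labels, centroids)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy 4x4 "characters": two with ink in the top half, two in the bottom half.
images = [
    [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[1, 1, 1, 1], [1, 1, 0, 1], [0, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]],
    [[0, 0, 1, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 1, 1]],
]
features = [zoning_features(img) for img in images]
labels, centroids = k_means(features, k=2)
print(labels)  # prints [0, 0, 1, 1]
```

The toy run uses k = 2 only because the example data has two groups; for the three scripts considered in the paper, k = 3 would be the natural setting.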

Cite This Paper

Neeru Garg, Munish Kumar, "Clustering of Multi Scripts Isolated Characters Using k-Means Algorithm", International Journal of Mathematical Sciences and Computing (IJMSC), Vol. 1, No. 2, pp. 22-29, 2015. DOI: 10.5815/ijmsc.2015.02.03
