IJITCS Vol. 8, No. 2, 8 Feb. 2016
Full Text (PDF, 209KB), PP.47-51
Modified k-means clustering, h-index, g-index
In this paper, I propose a modified K-means algorithm as a means of assessing the performance of scientific authors using their h- and g-index values. Conventional K-means scales poorly and is computationally inefficient, and it requires the user to supply the number of clusters. In this work, I introduce a modification of the K-means algorithm that searches the data efficiently: it computes the sum of squares within each cluster, which lets the program select the most promising subset of classes for clustering. The proposed algorithm was tested on the IRIS and ZOO data sets as well as on a local dataset comprising h- and g-indices, which are prominent markers of the scientific excellence of authors publishing papers in various national and international journals. The results of the analyses show that the modified K-means algorithm is much faster than the conventional algorithm and outperforms it in clustering performance, as measured by the data discrepancy factor.
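The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's modified algorithm, which is not specified here; it only shows the standard definitions of the h-index (Hirsch) and the g-index (Egghe), plus a within-cluster sum of squares computed for a given cluster assignment. The function names, the cap of the g-index at the number of papers, and the sample data are assumptions made for illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g*g citations.

    Assumption: g is capped at the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    g, total = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def within_cluster_ss(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


# Hypothetical citation counts for one author.
citations = [10, 8, 5, 4, 3]
author_point = (h_index(citations), g_index(citations))

# Hypothetical (h, g) points for three authors and a hypothetical assignment.
points = [(1, 1), (3, 3), (10, 10)]
labels = [0, 0, 1]
wcss = within_cluster_ss(points, labels)
```

Each author becomes one (h, g) point, and a lower within-cluster sum of squares indicates tighter clusters for a given assignment.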
S. Govinda Rao, A. Govardhan, "Evaluation of H- and G-indices of Scientific Authors using Modified K-Means Clustering Algorithm", International Journal of Information Technology and Computer Science(IJITCS), Vol.8, No.2, pp.47-51, 2016. DOI:10.5815/ijitcs.2016.02.06