A Technique to Choose the Proper Vector Space Models of Semantics in Case of Automatic Text Categorization

Full Text (PDF, 363KB), pp. 36-42

Author(s)

Sukanya Ray 1,*, Nidhi Chandra 1

1. Amity School Of Engineering & Technology, Amity University, Noida (U.P.), India

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2012.04.05

Received: 5 Dec. 2011 / Revised: 11 Jan. 2012 / Accepted: 10 Mar. 2012 / Published: 8 Apr. 2012

Index Terms

Vector Space Model, Term Document, Word Content, Pair Pattern

Abstract

Computers do not understand human language, which makes it difficult to have them perform tasks such as categorization according to human needs. The Vector Space Model (VSM) provides a solution to this limitation. There are three broad categories of VSM: term-document, word-content and pair-pattern matrices. This paper discusses these three categories for the semantic analysis of texts and selects the appropriate one for automatic categorization. The scenario considered is the categorization of research papers for organizing a national or international conference, based on the proposed methodology.
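
The three matrix types named in the abstract can be illustrated with a minimal sketch. It is not the authors' methodology: the toy corpus, the function names, the context window and the gap limit are all assumptions made for illustration. The second type, written "word-content" above, is usually called the word-context matrix in the VSM literature, and the sketch uses that name.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus; each entry stands in for one paper abstract.
docs = {
    "d1": "the mason cuts the stone",
    "d2": "the carpenter cuts the wood",
    "d3": "the network routes the packet",
}

def term_document(docs):
    """Rows are terms, columns are documents, cells are raw counts."""
    matrix = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.split():
            matrix[term][doc_id] += 1
    return matrix

def word_context(docs, window=2):
    """Rows are words, columns are words seen within `window` tokens."""
    matrix = defaultdict(Counter)
    for text in docs.values():
        tokens = text.split()
        for i, word in enumerate(tokens):
            start, stop = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    matrix[word][tokens[j]] += 1
    return matrix

def pair_pattern(docs, max_gap=2):
    """Rows are ordered word pairs, columns are the tokens joining them."""
    matrix = defaultdict(Counter)
    for text in docs.values():
        tokens = text.split()
        for i in range(len(tokens)):
            # Pairs separated by 1..max_gap intervening tokens.
            for j in range(i + 2, min(len(tokens), i + max_gap + 2)):
                pattern = "X " + " ".join(tokens[i + 1:j]) + " Y"
                matrix[(tokens[i], tokens[j])][pattern] += 1
    return matrix

def column(matrix, col_id):
    """Extract one column of a sparse matrix as a {row: count} vector."""
    return {row: cells[col_id] for row, cells in matrix.items() if col_id in cells}

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

td = term_document(docs)
wc = word_context(docs)
pp = pair_pattern(docs)

# Document similarity (term-document columns).
sim_docs = cosine(column(td, "d1"), column(td, "d2"))
# Word similarity (word-context rows).
sim_words = cosine(wc["mason"], wc["carpenter"])
# Relational similarity (pair-pattern rows).
sim_pairs = cosine(pp[("mason", "stone")], pp[("carpenter", "wood")])
```

Each matrix supports a different kind of comparison: term-document columns compare whole documents, word-context rows compare individual words, and pair-pattern rows compare relations between word pairs. On this toy corpus, d1 is closer to d2 than to d3 in the term-document space, which is the kind of comparison a document categorizer would rely on.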

Cite This Paper

Sukanya Ray, Nidhi Chandra, "A Technique to Choose the Proper Vector Space Models of Semantics in Case of Automatic Text Categorization", International Journal of Modern Education and Computer Science (IJMECS), vol.4, no.4, pp.36-42, 2012. DOI:10.5815/ijmecs.2012.04.05
