Survey on Word Sense Disambiguation: An Initiative towards an Indo-Aryan Language

Full Text (PDF, 619KB), PP.37-52

Author(s)

Jumi Sarmah 1, Shikhar Kumar Sarma 1

1. Department of Information Technology, Gauhati University, Guwahati, Assam, 781014, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijem.2016.03.04

Received: 14 Jan. 2016 / Revised: 1 Mar. 2016 / Accepted: 8 Apr. 2016 / Published: 8 May 2016

Index Terms

Assamese, Lexical Ambiguity, Natural Language Processing, Word Sense Disambiguation

Abstract

Word Sense Disambiguation (WSD), the resolution of lexical ambiguity, is the task of automatically selecting the correct sense of an ambiguous term from its set of possible senses, given the context in which it occurs. It plays a vital role as an intermediate phase in many Natural Language Processing (NLP) applications, such as Machine Translation, Information Retrieval, Speech Processing, Hypertext Navigation, and Part-of-Speech tagging. The existing literature describes various approaches to lexical ambiguity resolution, including knowledge-based and corpus-based ones. In recent years, many WSD systems have been developed for Indian languages such as Hindi, Malayalam, Manipuri, Nepali, and Kannada, but no such automated system has yet emerged for the Indo-Aryan language Assamese. Our future work aims to develop a model for the WSD problem that is fast, optimal, and efficient in terms of accuracy and scalability. This paper presents a survey of this research topic, discussing the WSD problem and the various approaches along with their algorithms. It also lists the NLP applications that would benefit from an integrated disambiguation system. The evaluation measures used to determine WSD performance are also discussed.

Cite This Paper

Jumi Sarmah, Shikhar Kumar Sarma, "Survey on Word Sense Disambiguation: An Initiative towards an Indo-Aryan Language", International Journal of Engineering and Manufacturing (IJEM), Vol. 6, No. 3, pp. 37-52, 2016. DOI: 10.5815/ijem.2016.03.04
