Sentiment Analysis of Review Datasets Using Naïve Bayes' and K-NN Classifier

Full Text (PDF, 843KB), PP.54-62

Author(s)

Lopamudra Dey 1,*, Sanjay Chakraborty 2, Anuraag Biswas 1, Beepa Bose 1, Sweta Tiwari 1

1. Department of Computer Science & Engineering, Heritage Institute of Technology, Kolkata, India

2. Department of Computer Science & Engineering, Institute of Engineering & Management, Kolkata, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2016.04.07

Received: 9 Apr. 2016 / Revised: 3 May 2016 / Accepted: 10 Jun. 2016 / Published: 8 Jul. 2016

Index Terms

Sentiment Analysis, Naïve Bayes', K-NN, Supervised Machine Learning, Text Mining

Abstract

The advent of Web 2.0 has led to an increase in the amount of sentimental content available on the Web. Such content is often found on social media websites in the form of movie or product reviews, user comments, testimonials, messages in discussion forums, etc. Timely discovery of sentimental or opinionated web content has a number of advantages, the most important being monetization. Understanding the sentiments of the general public towards different entities and products enables better services for contextual advertising, recommendation systems, and analysis of market trends. The focus of our project is a sentiment-focused web crawling framework that facilitates the quick discovery and analysis of sentimental content in movie reviews and hotel reviews. We use statistical methods to capture elements of subjective style and sentence polarity. The paper discusses in detail two supervised machine learning algorithms, K-Nearest Neighbour (K-NN) and Naïve Bayes', and compares their overall accuracy, precision, and recall. For movie reviews, Naïve Bayes' gave far better results than K-NN, but for hotel reviews the two algorithms gave lower and almost identical accuracies.
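
As an illustration of the comparison described in the abstract, the following is a minimal Python sketch of the two classifiers and the three metrics. It is not the authors' implementation: the whitespace tokenizer, add-one (Laplace) smoothing, cosine similarity as the K-NN distance, k = 3, and the six-review toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for wc in self.word_counts.values() for w in wc}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for c in sorted(self.class_counts):
            # log P(c) + sum of log P(word | c) over known words
            score = math.log(self.class_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for w in tokenize(doc):
                if w in self.vocab:
                    score += math.log((self.word_counts[c][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KNN:
    """K-nearest neighbours with majority vote over the k most similar documents."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        self.items = [(Counter(tokenize(d)), l) for d, l in zip(docs, labels)]
        return self

    def predict(self, doc):
        query = Counter(tokenize(doc))
        nearest = sorted(self.items, key=lambda it: cosine(query, it[0]),
                         reverse=True)[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(model, docs, labels, positive="pos"):
    # Accuracy, plus precision and recall for the positive class.
    preds = [model.predict(d) for d in docs]
    tp = sum(p == positive and y == positive for p, y in zip(preds, labels))
    fp = sum(p == positive and y != positive for p, y in zip(preds, labels))
    fn = sum(p != positive and y == positive for p, y in zip(preds, labels))
    correct = sum(p == y for p, y in zip(preds, labels))
    return {
        "accuracy": correct / len(labels),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical toy corpus, invented for this sketch (not the paper's datasets).
train_docs = [
    "great movie wonderful acting",
    "loved the film great story",
    "wonderful hotel friendly staff",
    "terrible movie boring plot",
    "awful film bad acting",
    "dirty hotel rude staff",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
test_docs = ["great story wonderful film", "boring film awful plot"]
test_labels = ["pos", "neg"]

nb = NaiveBayes().fit(train_docs, train_labels)
knn = KNN(k=3).fit(train_docs, train_labels)
print("Naive Bayes:", evaluate(nb, test_docs, test_labels))
print("K-NN:", evaluate(knn, test_docs, test_labels))
```

On a corpus this small both classifiers label the two test reviews correctly, so the sketch only shows how the metrics are computed; it says nothing about the accuracy gap the paper reports on real movie and hotel reviews.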

Cite This Paper

Lopamudra Dey, Sanjay Chakraborty, Anuraag Biswas, Beepa Bose, Sweta Tiwari, "Sentiment Analysis of Review Datasets Using Naïve Bayes' and K-NN Classifier", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.8, No.4, pp.54-62, 2016. DOI: 10.5815/ijieeb.2016.04.07

References

[1] Lina L. Dhande and Dr. Prof. Girish K. Patnaik, "Analyzing Sentiment of Movie Review Data using Naive Bayes Neural Classifier", IJETTCS, Volume 3, Issue 4, July-August 2014, ISSN 2278-6856.

[2] P. Kalaivani, "Sentiment Classification of Movie Reviews by supervised machine learning approaches", Indian Journal of Computer Science and Engineering (IJCSE), ISSN: 0976-5166, Vol. 4, No. 4, Aug-Sep 2013, 285.

[3] Meena Rambocas, João Gama, "Marketing Research: The Role of Sentiment Analysis", n. 489, April 2013, ISSN: 0870-8541.

[4] Weiguo Fan, Linda Wallace, Stephanie Rich, and Zhongju Zhang, (2005), "Tapping into the Power of Text Mining", Journal of ACM, Blacksburg.

[5] "Movie review dataset," [Online]. Available: http://www.cs.cornell.edu/people/pabo/movie-review-data/, [Accessed: October 2013].

[6] K. M. Leung, "Naive Bayesian classifier," [Online]. Available: http://www.sharepdf.com/81fb247fa7c54680a94dc0f3a253fd85/naiveBayesianClassifier.pdf, [Accessed: September 2013].

[7] Zhou Yong, Li Youwen and Xia Shixiong, "An Improved KNN Text Classification Algorithm Based on Clustering", Journal of Computers, vol. 4, no. 3, March 2009.

[8] G. Vinodhini, RM. Chandrasekaran, "Sentiment Analysis and Opinion Mining: A Survey", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2, Issue 6, June 2012, ISSN: 2277 128X.

[9] Rudy Prabowo, Mike Thelwall, "Sentiment Analysis: A Combined Approach", Journal of Informatics, 3(1):143–157, 2009.

[10] Walaa Medhat, Ahmed Hassan, Hoda Korashy, "Sentiment analysis algorithms and applications: A survey", Ain Shams Engineering Journal (2014) 5, 1093–1113.

[11] Svetlana Kiritchenko, Xiaodan Zhu, Saif M. Mohammad, "Sentiment Analysis of Short Informal Texts", Journal of Artificial Intelligence Research (2014): 723-762.

[12] Jusoh, Shaidah, and Hejab M. Alfawareh. "Techniques, applications and challenging issue in text mining." International Journal of Computer Science Issues (IJCSI) 9, no. 6 (2012).

[13] Eniafe Festus Ayetiran, Adesesan Barnabas Adeyemo, "A Data Mining-Based Response Model for Target Selection in Direct Marketing", I.J. Information Technology and Computer Science, 2012, vol.1, pp 9-18, DOI:10.5815/ijitcs.2012.01.02.

[14] Saptarsi Goswami, Amlan Chakrabarti, "Feature Selection: A Practitioner View", I.J. Information Technology and Computer Science, 2014, vol. 11, pp 66-77.

[15] L. Dey and S. Chakraborty, "Canonical PSO Based K-Means Clustering Approach for Real Datasets", International Scholarly Research Notices, Hindawi Publishing Corporation, Vol. 2014, pp. 1-11, 2014.

[16] R. Dey and S. Chakraborty, "Convex-hull & DBSCAN clustering to predict future weather", 6th International IEEE Conference and Workshop on Computing and Communication, Canada, 2015, pp.1-8.