An Ensemble Model using a BabelNet Enriched Document Space for Twitter Sentiment Classification

Full Text (PDF, 429KB), pp. 24-31


Author(s)

Semih Sevim 1,*, Sevinç İlhan Omurca 1, Ekin Ekinci 1

1. Kocaeli University Computer Engineering Department, Kocaeli, TURKEY

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2018.01.03

Received: 22 Sep. 2017 / Revised: 22 Oct. 2017 / Accepted: 7 Nov. 2017 / Published: 8 Jan. 2018

Index Terms

Twitter sentiment classification, ensemble learning, semantic enrichment, BabelNet

Abstract

With the widespread use of social media in daily life, user reviews have become an influential factor in many fields, including understanding consumer attitudes, determining political tendencies, and revealing the strengths and weaknesses of organizations. Today, people chat with friends, maintain social relationships, shop, and follow current events through social media. However, social media platforms limit the length of user messages, so users often express their opinions with emoticons, abbreviations, slang, and symbols instead of words. This makes sentiment classification of social media texts more complex. In this paper, a sentiment classification model for Twitter messages is proposed to overcome this difficulty. In the proposed model, the short messages are first expanded with BabelNet, a concept network. The expanded and original forms of the messages are then included in an ensemble learning model. We compared our ensemble model with traditional classification algorithms and observed an increase in the F-measure value.
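The two-step idea in the abstract (expand each short message, then combine classifiers trained on the original and the expanded text) can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the authors' implementation: the `TOY_CONCEPTS` dictionary is a hypothetical stand-in for BabelNet lookups, the base learner is a simple multinomial Naive Bayes, and the combination rule is an unweighted average of class probabilities. The abstract does not state which classifiers or combination rule the paper actually uses.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for BabelNet: maps a token to related concepts.
TOY_CONCEPTS = {
    ":)": ["happy", "smile"],
    ":(": ["sad", "frown"],
    "luv": ["love", "affection"],
    "gr8": ["great", "good"],
    "h8": ["hate", "dislike"],
}

def expand(tokens):
    """Return the tokens followed by any related concepts found for them."""
    out = list(tokens)
    for tok in tokens:
        out.extend(TOY_CONCEPTS.get(tok, []))
    return out

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed base learner)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        self.vocab = {t for c in self.counts.values() for t in c}
        self.n_docs = len(labels)
        return self

    def proba(self, doc):
        """Return a dict of class -> probability for a tokenized document."""
        scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            s = math.log(self.prior[c] / self.n_docs)
            for t in doc:
                if t in self.vocab:  # ignore tokens never seen in training
                    s += math.log((self.counts[c][t] + 1) / total)
            scores[c] = s
        # Normalize in log space for numerical stability.
        m = max(scores.values())
        z = sum(math.exp(v - m) for v in scores.values())
        return {c: math.exp(v - m) / z for c, v in scores.items()}

def ensemble_predict(model_orig, model_exp, tokens):
    """Average class probabilities from both document spaces; pick the best."""
    p_orig = model_orig.proba(tokens)
    p_exp = model_exp.proba(expand(tokens))
    avg = {c: (p_orig[c] + p_exp[c]) / 2 for c in p_orig}
    return max(avg, key=avg.get)

# Tiny made-up training set, for illustration only.
train = [
    ("luv this phone :)".split(), "pos"),
    ("gr8 service today".split(), "pos"),
    ("h8 the delay :(".split(), "neg"),
    ("worst update ever".split(), "neg"),
]
docs = [d for d, _ in train]
labels = [y for _, y in train]

model_orig = NaiveBayes().fit(docs, labels)                     # original space
model_exp = NaiveBayes().fit([expand(d) for d in docs], labels)  # enriched space

print(ensemble_predict(model_orig, model_exp, "gr8 phone :)".split()))
```

In this sketch the enriched model sees extra concept tokens for emoticons and abbreviations, so two messages that share no surface words can still share features. Averaging the two models is only one possible way to combine them; majority voting or stacking would fit the same structure.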

Cite This Paper

Semih Sevim, Sevinç İlhan Omurca, Ekin Ekinci, "An Ensemble Model using a BabelNet Enriched Document Space for Twitter Sentiment Classification", International Journal of Information Technology and Computer Science (IJITCS), Vol.10, No.1, pp.24-31, 2018. DOI:10.5815/ijitcs.2018.01.03
