Ogunsuyi Opeyemi J.; Adebola K. OJO

K-Nearest Neighbors Bayesian Approach to False News Detection from Text on Social Media



Author(s)

Ogunsuyi Opeyemi J.¹,*, Adebola K. OJO¹

1. Department of Computer Science, University of Ibadan, Ibadan, Nigeria

* Corresponding author.

DOI: https://doi.org/10.5815/ijeme.2022.04.03

Received: 28 Feb. 2022 / Revised: 26 May 2022 / Accepted: 24 Jun. 2022 / Published: 8 Aug. 2022

Index Terms

False News/Information Detection, K-Nearest Neighbours, Bayesian, Word2Vector, Term Frequency-Inverse Document Frequency.

Abstract

Social media usage has grown with the rapid emergence of new technologies, and false news/information is difficult to detect manually because it is crafted to capture the human mind. The spread of false news can cause havoc; its detection is therefore paramount now that almost everyone has access to social media. Our proposed system optimizes the false news detection process by combining the advantages of two textual feature extraction methods and two machine learning algorithms for text classification. Basic pre-processing methods were employed. Feature extraction was carried out using Term Frequency-Inverse Document Frequency (TF-IDF) together with Word2Vector. The K-Nearest Neighbour (KNN) and Naïve Bayes (NB) algorithms were combined to give KNN Bayesian. Most existing systems use a single feature extraction method, whereas our system combines two. The models were evaluated on accuracy, precision, recall, and F1-score, and KNN Bayesian performed better than KNN. As a further evaluation, the Area Under the Curve of the Receiver Operating Characteristic (AUC-ROC) was computed: the AUC of the KNN Bayesian ROC curve is higher than that of KNN.
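The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state how KNN and Naïve Bayes are fused, so the equal-weight average below is an assumption made purely for illustration, not the authors' method. The Word2Vector features are omitted to keep the example dependency-free, and the toy corpus, the function names, the smoothed IDF formula, and the choice k=3 are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Basic pre-processing: lowercase and split on whitespace.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency per term (one common variant).
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in training are dropped.
    counts = Counter(t for t in doc if t in idf)
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_score(vec, train_vecs, labels, k=3):
    # Fraction of the k most similar training documents labelled 1 (false).
    ranked = sorted(zip(train_vecs, labels),
                    key=lambda pair: cosine(vec, pair[0]), reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)

def fit_nb(docs, labels):
    # Multinomial Naive Bayes with Laplace smoothing.
    # Assumes both classes (0 and 1) occur in the training labels.
    vocab = {t for d in docs for t in d}
    model = {}
    for c in (0, 1):
        class_docs = [d for d, l in zip(docs, labels) if l == c]
        counts = Counter(t for d in class_docs for t in d)
        total = sum(counts.values())
        model[c] = {
            "log_prior": math.log(len(class_docs) / len(docs)),
            "log_lik": {t: math.log((counts[t] + 1) / (total + len(vocab)))
                        for t in vocab},
        }
    return model

def nb_score(doc, model):
    # Posterior probability of class 1; out-of-vocabulary tokens are ignored.
    logp = {c: m["log_prior"] + sum(m["log_lik"][t] for t in doc
                                    if t in m["log_lik"])
            for c, m in model.items()}
    peak = max(logp.values())
    exp = {c: math.exp(v - peak) for c, v in logp.items()}
    return exp[1] / (exp[0] + exp[1])

def knn_bayes_score(text, train_texts, labels, k=3):
    # ASSUMPTION: equal-weight average of the KNN vote and the NB posterior.
    docs = [tokenize(t) for t in train_texts]
    idf = fit_idf(docs)
    train_vecs = [tfidf(d, idf) for d in docs]
    model = fit_nb(docs, labels)
    doc = tokenize(text)
    knn = knn_score(tfidf(doc, idf), train_vecs, labels, k)
    return 0.5 * knn + 0.5 * nb_score(doc, model)

# Hypothetical toy corpus: label 1 = false news, label 0 = genuine news.
train_texts = [
    "miracle cure doctors hate",
    "shocking secret cure revealed",
    "aliens endorse miracle diet",
    "council approves road budget",
    "ministry publishes exam timetable",
    "ministry releases budget report",
]
labels = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    for text in ("shocking miracle cure", "ministry budget report"):
        print(text, round(knn_bayes_score(text, train_texts, labels), 3))
```

A score above 0.5 would be read as "false" and a score below 0.5 as "genuine". In practice, a library such as scikit-learn would replace the hand-written TF-IDF, KNN, and Naïve Bayes steps, and the fusion rule would follow whatever the full paper specifies.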

Cite This Paper

Ogunsuyi Opeyemi J., Adebola K. OJO, "K-Nearest Neighbors Bayesian Approach to False News Detection from Text on Social Media", International Journal of Education and Management Engineering (IJEME), Vol.12, No.4, pp. 22-32, 2022. DOI:10.5815/ijeme.2022.04.03

