Enhancing Sentiment Analysis for the 2024 Indonesia Election Using SMOTE-Tomek Links and Binary Logistic Regression

pp. 22-32



Neny Sulistianingsih 1,*, I Nyoman Switrayana 1

1. Department of Engineering, Universitas Bumigora, Mataram, Indonesia

* Corresponding author.

DOI: https://doi.org/10.5815/ijeme.2024.03.03

Received: 14 Oct. 2023 / Revised: 30 Dec. 2023 / Accepted: 24 Jan. 2024 / Published: 8 Jun. 2024

Index Terms

2024 Indonesia Election, Binary Logistic Regression, imbalanced data, sentiment analysis, SMOTE-Tomek Links, undersampling, oversampling


The Indonesian Election is one of the most anticipated political contests among the Indonesian people, mainly because it determines Indonesia's leaders for the next five years, from governors and legislative members to the president and vice president. Given the importance of this five-year agenda, good information about work programs, the activities of prospective leaders standing in the 2024 election, and various news stories has begun to spread on Twitter. Based on this, this research aims to analyze public sentiment on Twitter [...]. The research uses SMOTE-Tomek Links to address imbalanced data and Binary Logistic Regression for sentiment analysis. The model is evaluated using accuracy and ROC curves. The results show that the SMOTE-Tomek Links method is less than optimal for the 2024 election data used in this research, with an accuracy of 0.581 on training data and 0.406 on testing data. Undersampling methods such as Tomek Links and random undersampling show higher values when combined with Binary Logistic Regression: 0.983 and 0.938 for Tomek Links, and 0.964 and 0.902 for random undersampling, on training and testing data respectively.
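The resampling idea named in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the toy 2D points stand in for tweet feature vectors, and the function names (`nearest`, `smote`, `remove_tomek_links`), the neighbour count `k`, the seed, and the choice to drop both members of each Tomek link are all assumptions made for the example.

```python
import math
import random


def nearest(i, points):
    """Return the index of the point closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))


def smote(X, y, minority, k=3, seed=0):
    """Oversample the minority class by interpolating between minority
    neighbours until both classes have the same size (binary labels assumed)."""
    rng = random.Random(seed)
    X, y = list(X), list(y)
    idx = [i for i, label in enumerate(y) if label == minority]
    n_new = len(y) - 2 * len(idx)  # majority count minus minority count
    for _ in range(n_new):
        i = rng.choice(idx)
        # up to k nearest original minority neighbours of sample i
        neighbours = sorted((j for j in idx if j != i),
                            key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(neighbours)
        gap = rng.random()
        # synthetic point on the segment between sample i and neighbour j
        X.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y.append(minority)
    return X, y


def remove_tomek_links(X, y):
    """Single pass: drop both points of every Tomek link, i.e. every pair of
    mutual nearest neighbours that carry different labels."""
    nn = [nearest(i, X) for i in range(len(X))]
    linked = {i for i in range(len(X)) if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in range(len(X)) if i not in linked]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: six majority points (label 0), three minority (label 1).
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.4, 0.2), (0.3, 0.5), (0.5, 0.4),
     (0.45, 0.45), (1.0, 1.0), (1.1, 0.9)]
y = [0] * 6 + [1] * 3

X_bal, y_bal = smote(X, y, minority=1)           # oversampling step
X_res, y_res = remove_tomek_links(X_bal, y_bal)  # cleaning step
print(len(y), len(y_bal), len(y_res))
```

Calling `remove_tomek_links` on its own corresponds to the plain Tomek Links undersampling that the abstract compares against; the resampled data would then be passed to a binary logistic regression classifier.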

Cite This Paper

Neny Sulistianingsih, I Nyoman Switrayana, "Enhancing Sentiment Analysis for the 2024 Indonesia Election Using SMOTE-Tomek Links and Binary Logistic Regression", International Journal of Education and Management Engineering (IJEME), Vol.14, No.3, pp. 22-32, 2024. DOI:10.5815/ijeme.2024.03.03

