Analysis of Labeling and Class-Balancing Effects on Clash of Champions Sentiment Using LSTM and BERT

  • Audi Ilham Atmaja, Universitas Muhammadiyah Magelang, Indonesia
  • Maimunah Maimunah, Universitas Muhammadiyah Magelang, Indonesia
  • Pristi Sukmasetya, Universitas Muhammadiyah Magelang, Indonesia
Keywords: Clash of Champions, Sentiment Analysis, VADER, SMOTE, BERT

Abstract

Advances in digital technology have changed how people interact and access information, including in education. One educational event that has drawn public attention is Clash of Champions by Ruangguru, a competition presented in an interactive format to increase young people's interest in learning. This study examines public opinion on the event using posts from the social media platform X. Using TweetHarvest, 1,891 tweets were collected and preprocessed (cleaning, case folding, normalization, tokenization, stopword removal, stemming, and translation into English). Twelve experimental scenarios were created by combining two labeling strategies (VADER and TextBlob) with class-balancing techniques (undersampling and SMOTE), and the LSTM and BERT models were evaluated in each scenario. The best result came from combining VADER, SMOTE, and BERT, which yielded an accuracy of 97.73%, precision and recall of 98% each, and F1-scores of 96% (positive), 99% (neutral), and 98% (negative). These findings highlight the effectiveness of transformer-based models such as BERT for sentiment classification, and indicate that SMOTE mitigated class imbalance, giving consistent and accurate performance across all three sentiment categories.
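
The experimental design summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the 12 scenarios arise from 2 labelers x 3 balancing options x 2 models, with a "none" (no balancing) baseline that the abstract does not name explicitly. It also assumes the +/-0.05 cut-off commonly used with VADER's compound score, since the study's actual thresholds are not stated. The function names are hypothetical, and SMOTE itself is not implemented here because it operates on numeric feature vectors.

```python
import random
from collections import Counter
from itertools import product

# Assumption: "none" is an unbalanced baseline, giving 2 x 3 x 2 = 12 scenarios.
LABELERS = ["VADER", "TextBlob"]
BALANCERS = ["none", "undersampling", "SMOTE"]
MODELS = ["LSTM", "BERT"]


def scenarios():
    """Enumerate every (labeler, balancer, model) combination."""
    return list(product(LABELERS, BALANCERS, MODELS))


def label_from_score(score, threshold=0.05):
    """Map a polarity score in [-1, 1] to a sentiment class.

    The default threshold follows the common VADER convention and is an
    assumption, not a value reported by the study.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    n_min = min(len(items) for items in by_class.values())
    kept = []
    for label, items in by_class.items():
        kept.extend((sample, label) for sample in rng.sample(items, n_min))
    rng.shuffle(kept)
    return [s for s, _ in kept], [y for _, y in kept]
```

For example, `len(scenarios())` is 12, and calling `random_undersample` on a label list with four positive, one neutral, and one negative item keeps exactly one item per class. Undersampling discards data, which is one plausible reason an oversampling method such as SMOTE can do better on a corpus of only 1,891 tweets.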

Published
2024-12-31
How to Cite
Atmaja, A., Maimunah, M., & Sukmasetya, P. (2024). Analysis of Labeling and Class-Balancing Effects on Clash of Champions Sentiment Using LSTM and BERT. Journal of Information Systems and Informatics, 6(4), 2868-2891. https://doi.org/10.51519/journalisi.v6i4.929
Section
Articles