Shielding Social Media: BERT and SVM Unite for Cyberbullying Detection and Classification
Abstract
This paper presents a novel approach to cyberbullying detection and classification in social media text using an ensemble model that combines BERT (Bidirectional Encoder Representations from Transformers) and a Support Vector Machine (SVM) with grid search for multiclass classification. We also compare the proposed model with various machine learning and deep learning models; the results show that it outperforms them, achieving an accuracy of 90% on the test data. Further, we use SHapley Additive exPlanations (SHAP), an Explainable AI (XAI) technique, to interpret the predictions of the BERT-SVM ensemble model.
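The abstract does not say how BERT and the SVM are combined, so the following is only a minimal sketch of one common arrangement: fixed-length text embeddings are fed to an SVM whose hyperparameters are chosen by grid search. It is not the authors' implementation. The synthetic 64-dimensional vectors (standing in for BERT embeddings), the four classes, and the parameter grid are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic feature vectors stand in for BERT sentence embeddings,
# and four classes stand in for the cyberbullying categories.
X, y = make_classification(
    n_samples=600,
    n_features=64,
    n_informative=16,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Hold out a stratified test split so every class appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale the features, then classify with an SVM (multiclass is handled by SVC).
pipeline = make_pipeline(StandardScaler(), SVC())

# Hypothetical grid; the values searched in the paper are not given in the abstract.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__kernel": ["linear", "rbf"],
    "svc__gamma": ["scale"],
}

# 3-fold cross-validated grid search over the SVM hyperparameters.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

In a real pipeline the synthetic matrix `X` would be replaced by embeddings extracted from a pretrained BERT model, and the fitted classifier could then be passed to a SHAP explainer to attribute its predictions to input features.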
Copyright (c) 2024 Journal of Information Systems and Informatics
This work is licensed under a Creative Commons Attribution 4.0 International License.