Detection of Hate Speech Code Mix Involving English and Other Nigerian Languages
Abstract
Hate speech is a recurring problem and a cause for global concern. Its recent proliferation has created room for violence and discrimination against specific individuals or groups. In Nigeria, message masking (the use of mixed languages) has become the new normal, especially for disseminating hateful and inciting comments, so there is a need to curb its spread on social media. This research therefore focuses on detecting hate speech on social media written in a code-mix of English, Pidgin and any of the three major Nigerian languages (Hausa, Igbo and Yoruba). Two machine learning algorithms were used: Support Vector Machine (SVM) and Random Forest (RF). Data were collected from tweets on the EndSARS protest and the 2023 Nigerian elections. The major features were extracted, and the text was converted into vectors using TF-IDF and Bag-of-Words (BoW); these vectors were used to train and test the models. The results showed that SVM classified hate speech better than RF on both TF-IDF and BoW features, averaging 93.43% for accuracy, 93.70% for precision, 93.43% for recall, and 93.57% for F1-score.
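The vectorization step mentioned in the abstract can be illustrated with a minimal standard-library sketch. The abstract does not state the tokenizer, the TF-IDF variant, or the tooling used in the study, so everything below is an assumption made for illustration: the whitespace tokenizer, the smoothed IDF formula (one common variant), the function names, and the three neutral toy sentences, which are invented and not drawn from the study's dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(docs):
    """Map each distinct token to a column index, in sorted token order."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def bow_vector(doc, vocab):
    """Bag-of-Words: raw count of each vocabulary token in the document."""
    vector = [0] * len(vocab)
    for tok, count in Counter(tokenize(doc)).items():
        if tok in vocab:  # tokens outside the vocabulary are ignored
            vector[vocab[tok]] = count
    return vector


def idf_weights(docs, vocab):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    weights = [0.0] * len(vocab)
    for tok, index in vocab.items():
        weights[index] = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1
    return weights


def tfidf_vector(doc, vocab, idf):
    """TF-IDF: term count scaled by the IDF weight (no length normalization)."""
    return [count * weight for count, weight in zip(bow_vector(doc, vocab), idf)]


# Invented, neutral toy sentences mixing English and Pidgin (illustration only).
docs = [
    "this election na wahala",
    "we go vote for election",
    "abeg make we vote",
]

vocab = build_vocabulary(docs)
idf = idf_weights(docs, vocab)

bow = bow_vector(docs[1], vocab)
tfidf = tfidf_vector(docs[1], vocab, idf)
```

Under these assumptions, a token that appears in fewer documents (such as "abeg") receives a larger IDF weight than one that appears in more (such as "election"), which is the property that distinguishes TF-IDF from raw BoW counts. The resulting vectors are the kind of input a classifier such as SVM or RF would then be trained and tested on.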
Copyright (c) 2023 Journal of Information Systems and Informatics
This work is licensed under a Creative Commons Attribution 4.0 International License.