Indonesian Health Question Multi-Class Classification Based on Deep Learning
DOI:
https://doi.org/10.51519/journalisi.v6i3.838
Keywords:
Health Question, Text Classification, Deep Learning, IndoBERT
Abstract
Online health forums are commonly used by Indonesians to ask questions about diseases. A well-known example, Alodokter, hosts hundreds of thousands of health questions, each assigned to a topic. A model that classifies questions by topic is important for better organization and for faster responses from the relevant health professionals. This research experimented with 20 deep learning models drawn from RNN, CNN, and IndoBERT architectures in different configurations, and compared their performance when classifying questions into six classes corresponding to the most common diseases that cause death in Indonesia. The results show that the majority of the models outperform the SVM baseline. Bidirectional RNNs such as BiLSTM and BiGRU combined with CNN achieve good metric scores, although a certain version of the IndoBERT model generally outperforms all the other models.
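As an illustration of how a candidate model could be scored against a baseline on a six-class task, here is a minimal sketch. It assumes macro-averaged F1 as the metric, which the abstract does not name; the function `macro_f1`, the topic labels, and both prediction lists are hypothetical placeholders, not data or results from the study.

```python
from collections import Counter

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over the given labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true or predicted examples scores 0.0 here.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)

# Six placeholder topic labels (assumption: the abstract does not list the classes).
LABELS = [f"topic_{i}" for i in range(1, 7)]

# Toy gold labels: two questions per topic.
Y_TRUE = [label for label in LABELS for _ in range(2)]

# Hypothetical predictions from a baseline and from a candidate model.
BASELINE_PRED = ["topic_1", "topic_2", "topic_2", "topic_2", "topic_3", "topic_4",
                 "topic_4", "topic_4", "topic_5", "topic_6", "topic_6", "topic_6"]
CANDIDATE_PRED = ["topic_1", "topic_1", "topic_2", "topic_2", "topic_3", "topic_3",
                  "topic_4", "topic_4", "topic_5", "topic_6", "topic_6", "topic_6"]

baseline_score = macro_f1(Y_TRUE, BASELINE_PRED, LABELS)
candidate_score = macro_f1(Y_TRUE, CANDIDATE_PRED, LABELS)
print(f"baseline macro-F1:  {baseline_score:.3f}")
print(f"candidate macro-F1: {candidate_score:.3f}")
```

Macro averaging weights every class equally, so a model cannot score well by favoring only the most frequent topics; whether that matches the evaluation used in the paper is an assumption.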