An Empirical Comparison of C4.5, Naive Bayes, and KNN for Scholarship Selection
DOI:
https://doi.org/10.63158/journalisi.v8i3.1617
Keywords:
Scholarship Classification, Machine Learning, Comparative Benchmarking, Cross-Validation, Student Data Mining
Abstract
Scholarship selection is a critical process in higher education that requires objective, fair, and efficient evaluation of applicants based on academic and socio-economic criteria. However, manual assessment methods are often vulnerable to bias, inconsistency, and administrative inefficiencies, which may affect the transparency and quality of decision-making. This study compares the performance of three supervised machine learning algorithms—C4.5 Decision Tree, Naive Bayes, and K-Nearest Neighbor (KNN)—for scholarship recipient classification. The dataset consisted of 1,500 student records obtained from the KelasAI repository and included ten predictor attributes, namely Grade Point Average, Parental Income, Academic Semester, Family Dependents, Organizational Involvement, Academic Achievement, Regional Origin, Scholarship Type, National Examination Score, and Economic Status. The target variable was categorized into Accepted and Rejected classes. Experiments were conducted using RapidMiner Studio with 10-fold stratified cross-validation to ensure reliable model evaluation. The results showed that Naive Bayes achieved the best performance, with 81.6% accuracy, 81.8% precision, and 81.3% recall, outperforming C4.5 and KNN. These findings demonstrate the potential of machine learning to support more transparent and data-driven scholarship selection processes.
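The evaluation protocol described in the abstract (stratified 10-fold cross-validation scored by accuracy, precision, and recall) can be illustrated with a minimal sketch. This is not the RapidMiner Studio workflow used in the study; it is a standard-library Python approximation under stated assumptions. The function names (`stratified_folds`, `binary_metrics`, `cross_validate`, `one_nn`) are hypothetical, "Accepted" is assumed to be the positive class, metrics are assumed to be averaged per fold, and the toy 1-nearest-neighbour classifier and the synthetic two-feature records stand in for the real models and the KelasAI data.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign record indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred, positive="Accepted"):
    """Return (accuracy, precision, recall) for the assumed positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


def cross_validate(fit_predict, X, y, k=10, seed=0):
    """Run k-fold CV; fit_predict(train_X, train_y, test_X) returns labels."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        scores.append(binary_metrics([y[i] for i in test_idx], preds))
    # Assumption: report the mean of each metric over the k folds.
    return tuple(sum(s[j] for s in scores) / k for j in range(3))


def one_nn(train_X, train_y, test_X):
    """Toy 1-nearest-neighbour classifier (squared Euclidean distance)."""
    preds = []
    for x in test_X:
        nearest = min(
            range(len(train_X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
        )
        preds.append(train_y[nearest])
    return preds


if __name__ == "__main__":
    # Synthetic records (GPA, income) with an invented labelling rule.
    rng = random.Random(42)
    X = [(rng.uniform(2.0, 4.0), rng.uniform(1.0, 10.0)) for _ in range(200)]
    y = ["Accepted" if g - 0.2 * inc > 1.8 else "Rejected" for g, inc in X]
    acc, prec, rec = cross_validate(one_nn, X, y)
    print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

Any classifier that follows the same `fit_predict` signature can be passed to `cross_validate`, so the three algorithms would be compared on identical folds. Tools differ in whether they average per-fold metrics or pool predictions before scoring, so figures from this sketch are not directly comparable with the reported results.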
License
Copyright (c) 2026 Journal of Information Systems and Informatics

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors Declaration
- The Authors certify that they have read, understood, and agreed to the Journal of Information Systems and Informatics (JournalISI) submission guidelines, policies, and submission declaration. The submission has been prepared using the provided template.
- The Authors certify that all authors have approved the publication of this manuscript and that there is no conflict of interest.
- The Authors confirm that the manuscript is their original work, has not received prior publication, is not under consideration for publication elsewhere, and has not been previously published.
- The Authors confirm that all authors listed on the title page have contributed significantly to the work, have read the manuscript, attest to the validity and legitimacy of the data and its interpretation, and agree to its submission.
- The Authors confirm that the manuscript is not copied from or plagiarized from any other published work.
- The Authors declare that the manuscript will not be submitted for publication in any other journal or magazine until a decision is made by the journal editors.
- If the manuscript is finally accepted for publication, the Authors confirm that they will either proceed with publication immediately or withdraw the manuscript in accordance with the journal’s withdrawal policies.
- The Authors agree that, upon publication of the manuscript in this journal, they transfer copyright or assign exclusive rights to the publisher, including commercial rights.