A Comparative Study of Drug Prediction Models using KNN, SVM, and Random Forest
Abstract
Accurate drug classification supports medical decision-making by helping ensure that patients receive prescriptions appropriate to their physiological and biochemical characteristics. This study compares K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest models for predicting drug prescriptions from five patient attributes: age, sex, blood pressure, cholesterol level, and sodium-to-potassium ratio. The dataset, obtained from Kaggle, was preprocessed and split into training and testing sets, and the models were evaluated with accuracy as the primary metric. Random Forest outperformed both alternatives with a test accuracy of 100%, indicating strong generalization to the held-out data. SVM followed with a test accuracy of 97.50%, while KNN achieved the lowest accuracy at 70%, which suggests that it is limited in handling complex feature interactions. These findings highlight the effectiveness of ensemble learning methods in medical classification tasks and suggest that Random Forest is the most suitable of the three models for drug prediction on this dataset. Applied in clinical settings, such models could help improve treatment outcomes and patient care. Future research should explore feature engineering techniques, larger datasets, and additional machine learning approaches to improve predictive accuracy and applicability in real-world healthcare settings.
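The evaluation protocol described in the abstract (encode the patient attributes, split into training and testing sets, score by accuracy) can be illustrated with a minimal sketch. It uses only the Python standard library and a hand-written KNN classifier. The abstract does not state the encoding scheme, split ratio, or hyperparameters, so the category encodings, the 80/20 split, `k=5`, the synthetic records, and the labeling rule below are all assumptions made for illustration. This is not the authors' pipeline and it does not use the Kaggle data.

```python
import math
import random
from collections import Counter

# Hypothetical integer encodings for the categorical attributes.
SEX = {"F": 0, "M": 1}
BP = {"LOW": 0, "NORMAL": 1, "HIGH": 2}
CHOL = {"NORMAL": 0, "HIGH": 1}


def encode(record):
    """Turn (age, sex, bp, cholesterol, na_to_k) into a numeric vector."""
    age, sex, bp, chol, na_to_k = record
    return [float(age), SEX[sex], BP[bp], CHOL[chol], float(na_to_k)]


def make_synthetic_patients(n, seed=0):
    """Generate toy records; the labeling rule is invented, not clinical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        record = (
            rng.randint(15, 75),
            rng.choice("FM"),
            rng.choice(list(BP)),
            rng.choice(list(CHOL)),
            round(rng.uniform(6, 38), 2),
        )
        # Made-up rule so that the labels depend on the features.
        if record[4] > 15:
            label = "drugY"
        elif record[2] == "HIGH":
            label = "drugA"
        else:
            label = "drugX"
        rows.append((encode(record), label))
    return rows


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, x, k=5):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test records whose predicted label matches the true one."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


train, test = train_test_split(make_synthetic_patients(200))
print(f"KNN test accuracy on synthetic data: {accuracy(train, test):.2%}")
```

The same split and accuracy function could be reused to score any other classifier, which is what makes a comparison across models like the one reported here possible. Note that the features are left unscaled in this sketch, so age and the sodium-to-potassium ratio dominate the distance computation.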


Copyright (c) 2025 Journal of Information Systems and Informatics

This work is licensed under a Creative Commons Attribution 4.0 International License.