Detection of SQL Injection, XSS, and Command Injection Attacks in Web Payloads Using SVM, Random Forest, and XGBoost

Authors

  • Andrian Eko Widodo, Universitas Bina Sarana Informatika, Indonesia
  • Fabriyan Fandi Dwi Imaniawan, Universitas Bina Sarana Informatika, Indonesia
DOI:

https://doi.org/10.63158/journalisi.v8i3.1655

Keywords:

Web Application Security, Payload Classification, XGBoost, SHAP, Mutation Robustness

Abstract

Web application attacks, including SQL Injection (SQLi), Cross-Site Scripting (XSS), and Command Injection (CmdI), remain major threats to digital services. This study develops and evaluates an adversarial-aware protocol for multi-class malicious payload detection, focusing on accuracy, robustness against non-adaptive mutations, and practical inference feasibility. The protocol compares LinearSVC, Random Forest, and XGBoost with character-level neural baselines, namely character CNN and BiLSTM, and a transparent rule-based comparator. Evaluation integrates stratified sampling, deduplicated validation, mutation testing, SHAP-based interpretation, and end-to-end throughput measurement. Experiments used 49,998 stratified records from the SQLi-XSS-CommandInjection dataset in Google Colaboratory. On the internal test set, XGBoost obtained the best performance, achieving 99.28% accuracy and 99.32% macro F1-score. After removing 878 exact duplicate records for stricter re-evaluation, XGBoost maintained 99.21% accuracy and 99.24% macro F1-score, indicating that the findings were not driven solely by duplicate leakage. The complete preprocessing, feature extraction, and prediction pipeline reached an average CPU inference time of 0.832 ms per sample. SHAP analysis of Random Forest highlighted injection operators, script fragments, keyword hits, and structural tokens as discriminative features. The results provide a controlled benchmark, although validation on real HTTP logs remains future work.
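
The evaluation steps named in the abstract (exact-duplicate removal, macro F1 scoring, and non-adaptive mutation testing) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records, the three mutation operators (case swapping, URL encoding, whitespace doubling), and the rule-based `toy_classifier` stand-in are hypothetical and are not the authors' dataset, mutation set, or models.

```python
import random
import urllib.parse

# Hypothetical toy records (payload, label); not taken from the paper's dataset.
RECORDS = [
    ("' OR 1=1 --", "sqli"),
    ("' OR 1=1 --", "sqli"),  # exact duplicate, dropped by deduplicate()
    ("<script>alert(1)</script>", "xss"),
    ("; cat /etc/passwd", "cmdi"),
    ("hello world", "benign"),
]


def deduplicate(records):
    """Keep only the first occurrence of each exact payload string."""
    seen = set()
    unique = []
    for payload, label in records:
        if payload not in seen:
            seen.add(payload)
            unique.append((payload, label))
    return unique


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mutate(payload, rng):
    """Apply one randomly chosen mutation that ignores the model (non-adaptive).

    The three operators are assumed examples, not the paper's mutation set.
    """
    choice = rng.choice(["case", "urlencode", "whitespace"])
    if choice == "case":
        return payload.swapcase()
    if choice == "urlencode":
        return urllib.parse.quote(payload, safe="")
    return payload.replace(" ", "  ")


def toy_classifier(payload):
    """Hypothetical stand-in for a trained model: case-sensitive substring rules."""
    if "<script" in payload:
        return "xss"
    if " OR " in payload:
        return "sqli"
    if "cat /" in payload:
        return "cmdi"
    return "benign"


unique = deduplicate(RECORDS)
y_true = [label for _, label in unique]
clean_pred = [toy_classifier(p) for p, _ in unique]

rng = random.Random(0)  # fixed seed so the mutated set is reproducible
mutated_pred = [toy_classifier(mutate(p, rng)) for p, _ in unique]

print("records after deduplication:", len(unique))
print("macro F1 on clean payloads:", macro_f1(y_true, clean_pred))
print("macro F1 on mutated payloads:", macro_f1(y_true, mutated_pred))
```

Comparing the clean and mutated scores is the point of the sketch: a detector that relies on brittle surface patterns, like the stand-in above, can lose macro F1 under simple mutations even when its clean-set score is perfect.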

Published

2026-06-25

Section

Articles
