Detecting Data Leakage in Cloud Storage Using Decision Tree Classification
DOI:
https://doi.org/10.51519/journalisi.v7i3.1215
Keywords:
Cloud Storage, Data Leakage Detection, Decision Tree, GridSearchCV, Machine Learning
Abstract
Data leakage in cloud storage systems poses a significant security threat, potentially leading to unauthorized access, loss of sensitive information, and operational disruptions. This research proposes a classification model for detecting potential data leakage incidents using the Decision Tree algorithm. The dataset, obtained from the Kaggle public repository, contains user activity logs representing both normal and anomalous behaviors in cloud storage environments. Several preprocessing steps were applied to improve model quality, including handling missing values, removing outliers, and converting categorical data into numerical form. Hyperparameter optimization was performed using GridSearchCV to determine the best configuration for the Decision Tree classifier. Experimental results show that the optimized model achieved an accuracy of 70.84%, a precision of 55% for the data leakage class, and an F1-score of 40%. The analysis also highlights the significance of certain features, such as multi-factor authentication usage and access to confidential data, in predicting potential leakage events. This study provides a theoretical contribution by establishing a robust methodology for applying Decision Tree algorithms to a novel cloud security dataset, offering a scalable and interpretable framework for automated threat detection.
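The workflow summarized in the abstract (preprocessing, a GridSearchCV search over Decision Tree hyperparameters, per-class metrics, and feature importance) can be illustrated with a minimal scikit-learn sketch. It is not the authors' code. The data are synthetic, the column names (`mfa_enabled`, `accessed_confidential`, `download_mb`, `leak`) are hypothetical, and the IQR outlier rule, the parameter grid, and the F1 scoring choice are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_logs(n_rows=600, seed=0):
    """Build a small synthetic activity log; every column name is hypothetical."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "mfa_enabled": rng.choice(["yes", "no"], size=n_rows),
        "accessed_confidential": rng.choice(["yes", "no"], size=n_rows),
        "download_mb": rng.exponential(scale=50.0, size=n_rows),
    })
    # Invented label rule (with 10% label noise) so the sketch has signal to learn.
    risk = (df["mfa_enabled"] == "no") & (df["accessed_confidential"] == "yes")
    noise = rng.random(n_rows) < 0.1
    df["leak"] = (risk ^ noise).astype(int)
    # Inject a few missing values to exercise the cleaning step.
    missing_idx = df.sample(frac=0.05, random_state=seed).index
    df.loc[missing_idx, "download_mb"] = np.nan
    return df


def preprocess(df):
    """Fill missing values, drop IQR outliers, encode categoricals as 0/1."""
    df = df.copy()
    df["download_mb"] = df["download_mb"].fillna(df["download_mb"].median())
    # Assumed outlier rule: keep rows within 1.5 * IQR of the quartiles.
    q1, q3 = df["download_mb"].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = df["download_mb"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    df = df[keep].copy()
    for col in ["mfa_enabled", "accessed_confidential"]:
        df[col] = (df[col] == "yes").astype(int)
    return df


def fit_tree(df, seed=0):
    """Grid-search a Decision Tree with 5-fold cross-validation on the training split."""
    X = df.drop(columns="leak")
    y = df["leak"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    # Assumed grid; the abstract does not state which values were searched.
    param_grid = {
        "max_depth": [2, 4, 6, None],
        "min_samples_leaf": [1, 5, 20],
        "criterion": ["gini", "entropy"],
    }
    search = GridSearchCV(
        DecisionTreeClassifier(random_state=seed),
        param_grid,
        cv=5,
        scoring="f1",  # optimize F1 on the positive (leak) class
    )
    search.fit(X_train, y_train)
    return search, X_test, y_test


clean = preprocess(make_synthetic_logs())
search, X_test, y_test = fit_tree(clean)
print(search.best_params_)
print(classification_report(y_test, search.predict(X_test)))

# Impurity-based importances from the best tree, largest first.
importances = pd.Series(
    search.best_estimator_.feature_importances_, index=X_test.columns
)
print(importances.sort_values(ascending=False))
```

Scoring the search on F1 rather than accuracy is one possible choice when the leakage class is the minority class; the abstract does not say which metric guided the search. The printed report gives precision, recall, and F1 per class, which is the form in which the abstract's leakage-class figures are stated.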