Traffic Violation Clustering Using K-Medoids and Word Cloud Visualization

  • Muhammad Sabri S, Universitas Amikom Yogyakarta, Indonesia
  • Ema Utami, Universitas Amikom Yogyakarta, Indonesia
Keywords: Violations, Clustering, K-Medoids, PCA, Elbow Method, Silhouette Score, Word Cloud

Abstract

Traffic is the movement of people through shared road space, including both drivers and pedestrians. According to 2020 data from the Central Statistics Agency, the motor vehicles recorded in Makassar City by type were 248,682 passenger cars, 17,501 buses, 85,968 trucks, and 1,338,306 motorcycles, with the numbers tending to increase in the following year. A large number of vehicle users is likely to contribute to rising rates of traffic violations. This study aims to group traffic violation types in Makassar City using the K-Medoids algorithm and to visualize the clustering results with Word Cloud, in order to provide information on the patterns within each violation cluster. The case study uses 2021 data from the Traffic Police Department of Makassar City, covering 5,893 traffic violation cases. The data are ticket records with two features: the article violated and the vehicle type. The clustering results show that motorcycles and minibuses are the vehicles most frequently involved in traffic violations. Motorcycle (R2) violations are dominated by failure to wear an SNI-standard helmet, but they also include a significant number of cases of incomplete documents, failure to possess a SIM/STNK (Driver's License/Vehicle Registration), and failure to meet roadworthiness standards for equipment such as mirrors, headlights, and horns. Passenger vehicles, especially minibuses and cars, also account for a large share of violations. These involve not only seat belt use in R4 vehicles but also an incomplete STNK, inability to show a SIM, failure to display the Vehicle Registration Mark (TKB), and others. The resulting clusters are well separated, as indicated by a Silhouette Score of 0.867 at k = 9.
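
The pipeline named in the abstract (K-Medoids clustering scored by the Silhouette Score, followed by per-cluster term counts of the kind a word cloud renders) can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the six toy tickets, the Hamming distance over the two categorical features, the farthest-first initialization, and all function names are hypothetical and are not taken from the study, which may encode and cluster its data differently (the keywords mention PCA, for instance).

```python
from collections import Counter

# Hypothetical toy tickets (article, vehicle type); NOT the study's data.
tickets = [
    ("helmet", "motorcycle"),
    ("helmet", "motorcycle"),
    ("no_license", "motorcycle"),
    ("seat_belt", "minibus"),
    ("seat_belt", "minibus"),
    ("no_license", "minibus"),
]

def hamming(a, b):
    """Number of fields in which two categorical records differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))

def init_medoids(d, k):
    """Deterministic farthest-first start (an assumption, not the paper's method)."""
    n = len(d)
    medoids = [min(range(n), key=lambda i: sum(d[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(d[i][m] for m in medoids)))
    return medoids

def assign(d, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]

def k_medoids(d, k, max_iter=100):
    """Alternate between assigning points and re-picking each cluster's medoid."""
    n = len(d)
    medoids = init_medoids(d, k)
    for _ in range(max_iter):
        labels = assign(d, medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if members:
                # New medoid: the member with the smallest total distance to the rest.
                updated.append(min(members, key=lambda m: sum(d[m][j] for j in members)))
            else:
                updated.append(medoids[c])  # keep the old medoid for an empty cluster
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(d, medoids)

def silhouette_score(d, labels):
    """Mean of (b - a) / max(a, b) over all points; requires at least two clusters."""
    n = len(labels)
    total = 0.0
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            continue  # singleton cluster contributes 0 by convention
        a = sum(d[i][j] for j in own) / len(own)
        b = min(sum(d[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
                for c in set(labels) if c != labels[i])
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / n

# Precompute the pairwise distance matrix, cluster, and score the result.
d = [[hamming(p, q) for q in tickets] for p in tickets]
medoids, labels = k_medoids(d, k=2)
score = silhouette_score(d, labels)

# Article frequencies in cluster 0: the kind of input a word cloud would render.
cluster0_terms = Counter(art for (art, _), lab in zip(tickets, labels) if lab == 0)
print(medoids, labels, round(score, 3), cluster0_terms)
```

On this toy data the two clusters split by vehicle type and the score comes out at about 0.63. Repeating the run over a range of k values and comparing scores is one common way to choose k, alongside the Elbow Method listed in the keywords.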

Published
2025-03-20
How to Cite
S, M., & Utami, E. (2025). Traffic Violation Clustering Using K-Medoids and Word Cloud Visualization. Journal of Information Systems and Informatics, 7(1), 250-271. https://doi.org/10.51519/journalisi.v7i1.1002
Section
Articles