Edutech Digital Start-Up Customer Profiling Based on RFM Data Model Using K-Means Clustering

  • Dedy Panji Agustino Institute of Technology and Business STIKOM Bali
  • I Gede Harsemadi Institute of Technology and Business STIKOM Bali
  • I Gede Bintang Arya Budaya Institute of Technology and Business STIKOM Bali
Keywords: Customer Segmentation, Silhouette Coefficient, Elbow Method, Davies-Bouldin Index, Business Intelligence


Digital start-ups are high-risk companies because they are still searching for the most suitable business model and the right market. Growth is the primary goal of a start-up. As newly established companies, digital start-ups, including edutech start-ups, face a common challenge: their marketing processes and strategies for maintaining customer loyalty are often ineffective. Ineffective and inefficient plans waste resources. Hence, a method is needed for understanding customer characteristics. This study applies Business Intelligence in the form of customer profiling: transaction data is modeled with RFM (Recency, Frequency, Monetary) and clustered with the K-Means algorithm. The transaction data comes from an education-platform digital start-up assisted by the STIKOM Bali business incubator. The recency, frequency, and monetary values of sales are analyzed, and the optimal number of clusters is selected using three metrics: the Elbow Method, the Silhouette Score, and the Davies-Bouldin Index. For this case, K = 2 is the optimal cluster solution, where the first cluster contains customers who need more engagement and the second cluster contains the best customers.
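The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the transactions, the snapshot date, and every function name below are hypothetical, the K-Means seeding is simplified to the first k points, and only the silhouette criterion is shown (the Elbow Method and the Davies-Bouldin Index would be evaluated over the same candidate values of K).

```python
from collections import defaultdict
from datetime import date
import math

# Hypothetical toy transactions: (customer_id, purchase_date, amount).
TRANSACTIONS = [
    ("A", date(2022, 1, 5), 50.0), ("A", date(2022, 3, 1), 70.0),
    ("A", date(2022, 3, 20), 60.0), ("B", date(2022, 2, 10), 100.0),
    ("B", date(2022, 3, 25), 90.0), ("B", date(2022, 3, 28), 110.0),
    ("C", date(2021, 11, 2), 20.0), ("D", date(2021, 12, 15), 15.0),
    ("E", date(2021, 10, 30), 25.0),
]
SNAPSHOT = date(2022, 4, 1)  # assumed analysis date


def rfm_table(transactions, snapshot):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last, freq, money = {}, defaultdict(int), defaultdict(float)
    for cust, day, amount in transactions:
        last[cust] = max(last.get(cust, day), day)
        freq[cust] += 1
        money[cust] += amount
    return {c: ((snapshot - last[c]).days, freq[c], money[c]) for c in last}


def standardize(rows):
    """Z-score each column so no single RFM feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in rows]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = list(points[:k])
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = defaultdict(list)
    for p, l in zip(points, labels):
        clusters[l].append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


rfm = rfm_table(TRANSACTIONS, SNAPSHOT)
customers = list(rfm)
X = standardize([rfm[c] for c in customers])

# Score each candidate K and keep the one with the highest silhouette.
scores = {k: silhouette(X, kmeans(X, k)[0]) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
labels, _ = kmeans(X, best_k)
segments = dict(zip(customers, labels))
print(best_k, segments)
```

On this toy data the recent, frequent, high-spending customers separate from the lapsed, low-spending ones. In practice a library implementation with randomized restarts would usually be preferable to the fixed seeding used here.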





How to Cite
Agustino, D. P., Harsemadi, I. G., & Budaya, I. G. B. A. (2022). Edutech Digital Start-Up Customer Profiling Based on RFM Data Model Using K-Means Clustering. Journal of Information Systems and Informatics, 4(3), 724-736.