Customer Loyalty Analysis Using Recency, Frequency, Monetary (RFM) and K-means Cluster for Labuan Bajo Souvenirs in Online Store

A typical Labuan Bajo souvenir shop sells a variety of souvenirs specific to Labuan Bajo. However, the sales process is still manual: the shop relies on telephone or WhatsApp communication with customers to take orders. To increase sales, such shops are interested in adopting effective marketing strategies, so an automated system is needed to manage customers. Recency, Frequency, and Monetary (RFM) analysis is commonly used to assign values or weights to customers based on their transactions. These weights are then grouped using k-means. Analysis of the last three months of data shows that the typical Labuan Bajo souvenir shop has one highly loyal customer, three loyal customers, and six ordinary customers. Testing showed that the system's features function correctly, so the system can help the typical Labuan Bajo souvenir shop streamline the sales process.


INTRODUCTION
A typical Labuan Bajo souvenir shop offers a range of souvenirs specific to Labuan Bajo. Unfortunately, orders arriving via WhatsApp pile up in the shop's message queue, resulting in numerous errors during order recording. Furthermore, increasing competition in selling typical Labuan Bajo souvenirs has prompted the store to devise effective marketing strategies. The store aims to enhance customer service by introducing special discount programs based on transactions to boost sales and capture customer attention [1]. To address these issues, the store can implement a computer system that connects customers and sellers to simplify ordering and data recording. The system can also analyze each customer's order history using the recency, frequency, and monetary (RFM) method. This analysis generates a dataset that the K-means algorithm then clusters. The results classify customers into predetermined groups, which can be offered special discounts to attract and retain customers and increase orders.
RFM is a widely used method that categorizes data based on how recently a customer purchased (recency), how often the customer purchases (frequency), and the total purchase value (monetary) [2]. It helps to observe various customer behaviors [3]. Recency is the time span between the last purchase transaction and the data retrieval date, measured in days, months, or years. Frequency is the number of purchase transactions a customer makes. Monetary is the amount a customer spends on purchase transactions [4].
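To make the three definitions concrete, the following is a minimal Python sketch of how recency, frequency, and monetary values could be derived from a transaction list. The function name `compute_rfm`, the tuple layout `(customer_id, purchase_date, amount)`, the sample transactions, and the choice of days as the recency unit are assumptions made for illustration; they are not taken from the shop's system.

```python
from collections import defaultdict
from datetime import date


def compute_rfm(transactions, reference_date):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount) tuples
    (a hypothetical layout). Recency is measured in days between the last
    purchase and reference_date, the data retrieval date.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        # Keep the most recent purchase date seen for each customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1      # one more transaction
        monetary[customer_id] += amount  # running total spent
    return {
        cid: ((reference_date - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# Hypothetical sample transactions, for illustration only.
transactions = [
    ("C01", date(2022, 7, 5), 150_000),
    ("C01", date(2022, 9, 20), 200_000),
    ("C02", date(2022, 8, 1), 75_000),
]
rfm = compute_rfm(transactions, reference_date=date(2022, 9, 30))
for customer_id, (recency, freq, total) in sorted(rfm.items()):
    print(customer_id, recency, freq, total)
```

Each customer ends up with one row of three numbers, which is the shape of dataset the later weighting and clustering steps operate on.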
Previous studies have used K-Means and RFM for various purposes, such as identifying potential and loyal customers through customer segmentation with the RFM model and K-Means, and identifying suitable attributes for that segmentation [5]. Other studies have sought to determine customer groups with high loyalty and profitability values that are potentially profitable for the company. In that work, the segmentation process begins with data analysis; the data is then transformed using LRFM (length, recency, frequency, and monetary) and classified using the fuzzy C-Means method [6].
One case study conducted at PT Coversuper Indonesia Global used the RFM method to determine market segmentation by identifying the characteristics of each individual, allowing businesses to better understand their customers and create targeted marketing strategies [7]. Similarly, the RFM method has been used to help store business managers determine journal entries for clothing product restock requirements based on the transaction date, color, size, and total revenue [8].

The thinking framework consists of five stages, which are described as follows:
1. Problem identification stage: The problems that occur at the research location are identified. This information is later used in the customer loyalty analysis using the recency, frequency, and monetary (RFM) method and K-means clustering at a typical gift shop in Labuan Bajo.
2. Data collection stage: Data was collected by interviewing the owner of the Labuan Bajo souvenir shop, covering both the request for permission and the collection of data for the research.
3. Data processing stage: The obtained data is processed through the following steps:
   a. Selection: The initial dataset was taken from sales data obtained from interviews with typical Labuan Bajo souvenir shop owners. The data needed for this research was collected from several existing sales tables and combined.
   b. Preprocessing: This step prepares the dataset so that it has better quality and is more effective for modeling. Feature selection is carried out using the RFM method, resulting in three attributes: recency, frequency, and monetary. The discretization step then applies the equal-width technique, which assigns a weight to each continuous feature to make it discrete. The result is a customer dataset with simple features and the following attributes: customer ID, recency, frequency, and monetary.
   c. Clustering: The customer dataset is clustered using K-Means with K = 3. The value K = 3 is used to group customers according to the weighting produced by the RFM technique, in which customers are categorized with a value of 3, 2, or 1, with the conversion 3 = "Very Loyal," 2 = "Loyal," and 1 = "Potential Loyal."
4. Design and implementation stage: The system is designed based on the problems identified. In this study, a system was created for analyzing customer loyalty using the recency, frequency, monetary (RFM) and K-means cluster methods at Labuan Bajo souvenir shops. Based on the collected data, it is built as a web-based application.
5. Testing stage: In the final stage, black box testing is carried out on the web-based application with the recency, frequency, monetary (RFM) and K-means cluster models to check whether the system is suitable, whether its functions work, and whether it runs correctly according to the purpose of this research.
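The equal-width discretization mentioned in the preprocessing step can be sketched in Python as follows. This is an illustration under stated assumptions, not the shop's implementation: the function name `equal_width_weights`, the sample values, the fallback for identical values, and the `reverse` option (which gives the highest weight to the smallest value, a common convention for recency that the study does not spell out) are all hypothetical. Five bins are used to match the 1 to 5 weights described for the transaction grouping pattern.

```python
def equal_width_weights(values, bins=5, reverse=False):
    """Map continuous values to integer weights 1..bins using equal-width bins.

    The range [min, max] is split into `bins` intervals of equal width.
    With reverse=False the largest values get the highest weight; with
    reverse=True (an assumption, e.g. for recency in days) the smallest
    values get the highest weight.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no range to split, so every value
        # gets the lowest weight. This fallback is an assumption.
        return [1] * len(values)
    width = (hi - lo) / bins
    weights = []
    for v in values:
        # Bin index 0..bins-1; the maximum value is clamped into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        weight = index + 1
        weights.append(bins + 1 - weight if reverse else weight)
    return weights


# Hypothetical values, for illustration only.
monetary = [50_000, 120_000, 300_000, 410_000, 500_000]
recency_days = [3, 10, 31, 60, 88]

monetary_weights = equal_width_weights(monetary)
recency_weights = equal_width_weights(recency_days, reverse=True)
print(monetary_weights, recency_weights)
```

Because the bins have equal width rather than equal counts, several customers can share a weight while some weights stay unused, as happens with skewed spending data.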

Customer Grouping Analysis
This section presents the results of the data collection carried out at a typical Labuan Bajo souvenir shop, located in the town of Labuan Bajo, West Manggarai Regency, East Nusa Tenggara Province. The data covers the three months from July through September 2022. The customer transaction data was sorted and organized into a dataset based on the recency, frequency, and monetary (RFM) model. This dataset serves as the foundation for the subsequent customer loyalty analysis using RFM and K-means clustering, which is intended to help the souvenir shop improve its customer experience. The grouped data is presented in Table 1.

Customer Data Management
At this stage, the data processing is analyzed, starting with the identification of the dataset based on the transaction criteria pattern using the Recency, Frequency, and Monetary (RFM) model.

Transaction Grouping Pattern Data
The criteria used to classify customer transactions are Recency, Frequency, and Monetary. Preprocessing includes a discretization stage, in which each continuous data feature is given a weight so that it becomes categorical, discrete data [10]. This yields a customer dataset with simple features. RFM pattern analysis is influenced by the following factor: the weight value of each criterion, which the Labuan Bajo souvenir shop sets on a scale of 1 to 5 with the conversion 5 = "Very Satisfied", 4 = "Satisfied", 3 = "Ordinary", 2 = "Not Satisfied", 1 = "Not Satisfied". The range with the largest ratio value is given a weight of 5 because it indicates the most satisfied customers, and so on down to the customers with the smallest ratio value, who are given a weight of 1. The resulting weights form the base dataset for the next step, customer grouping using K-Means. The following is an example of data that has gone through the collection process.

Clustering Stage
At this stage, the data is processed with the k-means algorithm to group customers with similar characteristics.

Figure 2. Clustering Stage
The analysis using the K-Means method begins by collecting customer transaction history data. For example, the following is customer performance data in predetermined weights [11]. Next, the K-Means method determines the K value for the centroids, or cluster centers. The cluster center in this study is determined by finding the average value of the criteria weights from the customer performance data. The K value for each criterion for the iterations is shown below. The distance is then calculated using the Euclidean distance with the formula in Equation 1.
$$d(x, c) = \sqrt{\sum_{i=1}^{n} (x_i - c_i)^2} \qquad (1)$$

Here $x$ is a data object, $c$ is a centroid, and $n$ is the number of criteria (recency, frequency, and monetary). Once the Euclidean distance is known, the C1, C2, and C3 values can be identified, namely the distances from each data object to the centroids of clusters 1, 2, and 3; each data object is assigned to the cluster whose centroid is closest. If the cluster results in iteration 1 differ from those in iteration 2, the k-means process is repeated with new centroids. The new cluster center (centroid) is determined by calculating the average of each cluster using the equation:

$$c_k = \frac{1}{|C_k|} \sum_{x \in C_k} x$$

where $C_k$ is the set of data objects assigned to cluster $k$. From the new centroids, the shortest distance from each data object to the centroids is calculated again using the Euclidean distance formula. This calculation is repeated until the iteration results are the same as in the previous iteration; at that point the calculation can be stopped because the clusters have converged.

The cluster calculation results show that cluster 3, with "Highly Loyal" status, has 1 customer; cluster 2, with "Loyal" status, has 3 customers; and cluster 1, with "Ordinary" status, has 6 customers.

The research stage is an overview that explains the logical flow of the research in general. The stages carried out within the research framework are presented as a picture and a description.
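The iteration described above (assign each data object to its nearest centroid by Euclidean distance, recompute each centroid as the mean of its members, and stop when the assignments no longer change) can be sketched in plain Python. The function names, the sample weight triples, and the initial centroids are hypothetical and chosen only for illustration; they are not the shop's data, and how the study picks its starting centroids is not reproduced here. Cluster indices in the sketch start at 0, so index 0 corresponds to cluster 1 in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, initial_centroids, max_iter=100):
    """Basic k-means: returns (assignments, centroids).

    assignments[i] is the index of the cluster that points[i] belongs to.
    Iteration stops when the assignments equal those of the previous
    iteration (convergence) or after max_iter iterations.
    """
    centroids = [tuple(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: euclidean(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # same clusters as the previous iteration: converged
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical (recency, frequency, monetary) weights, for illustration only.
points = [(5, 5, 5), (4, 4, 3), (3, 4, 4), (1, 2, 1), (2, 1, 1), (1, 1, 1)]
# Hypothetical starting centroids for K = 3.
start = [(1, 1, 1), (3, 3, 3), (5, 5, 5)]

assignments, centroids = kmeans(points, start)
print(assignments)
print(centroids)
```

With well-separated weights like these, the assignments stabilize after the second pass, which is the convergence condition the text describes.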

System Implementation
The developed system, a website, is shown in the following figures. Upon opening the website, users are directed to a login page if they already have an account, or to a registration page if they need to create one. After logging in, the user is redirected to the main page of the website, which displays details of the products available for purchase, as depicted in Figure 3. The website also displays the results of the RFM and K-means analysis. The findings of Iteration 1 are presented as customer groups, or classes, on a dedicated page. The final output of the analysis can be viewed on the Result Page (Figure 4). The Transaction Data Page (Figure 5) provides a detailed overview of the transaction data gathered during the analysis process.

Test Results
The next stage of implementation is testing using the Black Box test. Black box testing is a type of software testing in which the tester examines the software application's external functionalities without having any knowledge of its internal structure, design, or coding. The tester focuses on inputs and outputs of the system and tries to identify any defects, errors, or unexpected results. This type of testing is usually performed at the end of the software development lifecycle to ensure that the software meets its functional requirements and user expectations [12]. The main objective of black box testing is to validate the software from the end-user's perspective and ensure that it behaves as expected.
The black box test results are in Table 5.

CONCLUSION
Several conclusions can be drawn from the results. Firstly, recency, frequency, and monetary (RFM) analysis can be effectively applied to group customers in typical Labuan Bajo souvenir shops. The system can analyze how recently customers purchased, how often they purchase, and how much they spend, thus assisting souvenir shops in Labuan Bajo with transactions between stores and customers. Secondly, the K-means clustering method can be employed to group customer transaction data in typical Labuan Bajo souvenir shops. The study showed that over the last 3 months, the typical souvenir shop had one highly loyal customer, three loyal customers, and six ordinary customers. Thirdly, the customer loyalty analysis system employs the K-means cluster algorithm to classify incoming customer transaction data into highly loyal, loyal, and ordinary customer groups. This system can assist typical souvenir shops in Labuan Bajo in creating effective strategies to retain loyal customers while attracting new ones.