Analysis of Using Google Maps Data to Measure the Presence or Accessibility of Urban Facilities for The BPS - Statistics Indonesia's Classification of Urban and Rural Villages

BPS - Statistics Indonesia classifies villages as urban or rural in order to organize statistics. The urban or rural status forms a stratum used in survey sampling, so that the sample drawn can represent the entire population well. BPS - Statistics Indonesia establishes criteria for classifying a village as an urban village. The 2020 urban village criteria use three indicators: population density per km², percentage of agricultural families, and the presence of or access to urban facilities. In general, the 2020 classification of urban and rural villages is calculated from the 2019 Village Potential (Podes) survey. This study instead uses data on urban facilities such as schools, markets, shops, and hospitals from the Google Maps website to calculate the score for the indicator of the presence of or access to urban facilities. The data were obtained with a web scraping method. Eight villages in Lubuk Sikaping District, Pasaman Regency, West Sumatra Province, were selected as a case study. The results showed that four villages have great potential to be classified as urban villages, and three villages have great potential to be classified as rural villages.


INTRODUCTION
The territory of Indonesia is divided into several levels of administrative regions, namely provinces, regencies or cities, sub-districts, and villages. The village is the smallest administrative area and is often used as an observation unit. The villages covered include villages with various legal statuses, such as traditional villages (with either village or sub-district status), preparatory villages, Smallest Settlement Units (UPT), and Alienated Community Settlements (PMT) [1]. Each village differs in the socioeconomic characteristics of its community, its features, and its environmental topology. These conditions continue to change along with the level of development in a village. Such changes in conditions are used as indicators to classify a village as an urban village or a rural village.
To organize statistics, BPS - Statistics Indonesia classifies villages into urban and rural villages [1]. Classification by urban and rural area is considered to describe the characteristics of a village better [1]. The urban or rural status of a village area is intended to form a stratum used in survey sampling techniques [2]. In addition, this classification is used to allocate census or survey officers for determining estimates of strategic figures. With the urban and rural status, it is hoped that the samples taken can represent the entire population well. In the analysis, classifying villages into urban or rural villages will provide results that better describe the actual situation compared to the classification of villages [2].
BPS - Statistics Indonesia sets the 2020 urban village criteria, which consist of three indicators: population density per km², percentage of agricultural families, and the existence of urban facilities or access to these facilities [1]. Each indicator yields a score, and the scores together determine whether a village is urban or rural. The classification of urban and rural villages is calculated using data from the 2019 Village Potential (Podes) survey. This research focuses on the existence of urban facilities or access to these facilities. Many urban facilities must be recorded in the Podes survey, including kindergartens, junior high schools, public high schools, markets, shops, hospitals, hotels, billiards, pubs, discotheques, karaoke venues, and salons [1]. Collecting data on these various urban facilities requires a great deal of cost, energy, and time.
Based on these problems, this study explores the use of landmark data on various types of urban facilities from the Google Maps site. The landmark data were extracted using web scraping methods [3]-[6]. The data were then processed to produce a score for the indicator of the existence of or access to urban facilities. As a case study, eight villages were selected in Lubuk Sikaping District, Pasaman Regency, West Sumatra Province.

METHODS
This study focused on calculating scores for the indicator of the presence of or access to urban facilities using several types of landmark data extracted from the Google Maps site with web scraping methods. The landmark data are then overlaid on a digital map of village administrative boundaries using QGIS software, after which the score is calculated. The framework of this study can be seen in Figure 1.
Benny Firmansyah, Widya Sri Wahyuni | 825
Several studies have used data extracted from Google Maps with web scraping methods for analysis or for a particular study. Some of these studies are listed in Table 1. This study calculates scores for the indicator of the presence of or access to urban facilities. Table 2 presents in full the variables, criteria, and scores used in the classification of urban villages and rural villages. Based on the criteria in Table 2, a village can achieve a maximum score of 25 and a minimum score of 2 [1]. The existence of or access to urban facilities is very influential, with a top score of 7 [1]. The cut-off point used to determine urban villages is 9 [1]. Villages with a score of 9 or more are therefore designated as urban villages, while villages with a score of less than 9 are designated as rural villages [1].
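The cut-off rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (urban if the total score is 9 or more, with totals ranging from 2 to 25); the function name and the range check are assumptions made for this sketch and are not taken from the BPS regulation.

```python
# Minimal sketch of the classification rule stated in the text.
# The names below are hypothetical; only the numbers 2, 25, and 9 come from the text.
MIN_TOTAL_SCORE = 2    # minimum total score a village can obtain
MAX_TOTAL_SCORE = 25   # maximum total score a village can obtain
URBAN_CUTOFF = 9       # villages scoring 9 or more are urban


def classify_village(total_score: int) -> str:
    """Return 'urban' or 'rural' for a village's total score."""
    if not MIN_TOTAL_SCORE <= total_score <= MAX_TOTAL_SCORE:
        raise ValueError(
            f"total score must be between {MIN_TOTAL_SCORE} and {MAX_TOTAL_SCORE}"
        )
    return "urban" if total_score >= URBAN_CUTOFF else "rural"
```

For example, a village with a total score of 9 would be labeled urban and one with a total score of 8 would be labeled rural.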

Web Scraping
Web scraping is a technique for retrieving textual characters from web pages, primarily for analysis [9]. Research in [7] also describes web scraping as web harvesting, a method that can be used to capture and extract a large amount of data from a website and store it in a structured format. The primary purpose of web scraping is to extract information from one or many websites and process it into simple structures such as spreadsheets, databases, or CSV (Comma Separated Values) files [10]-[13]. Many techniques have been used to retrieve content from a web page, such as APIs, computer languages, robots, intelligent agents, and web scraper software [14]. This study used Botsol Crawler, one type of web scraper software, to extract all the landmark information on urban facilities from the Google Maps site. The data attributes taken are the name of the landmark, the address, and the coordinate point of the landmark location. Data extraction through this software is done automatically using robots and without being blocked [5].
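Whatever tool performs the extraction, the result is a table of landmark names, addresses, and coordinates. The following minimal sketch shows one way such a table could be read into a structured form in Python. It assumes a CSV export with the hypothetical column names `name`, `address`, `latitude`, and `longitude`; it does not reproduce the actual output format of Botsol Crawler, and the sample rows are invented placeholders rather than scraped data.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Landmark:
    name: str
    address: str
    lat: float
    lon: float


def load_landmarks(csv_text: str) -> List[Landmark]:
    """Parse landmark rows, skipping rows without usable coordinates."""
    landmarks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinate: cannot be mapped
        landmarks.append(
            Landmark(
                name=(row.get("name") or "").strip(),
                address=(row.get("address") or "").strip(),
                lat=lat,
                lon=lon,
            )
        )
    return landmarks


# Invented placeholder rows (hypothetical column names, not real scraped data).
SAMPLE_CSV = """name,address,latitude,longitude
Example School A,Example Street 1,0.10,100.10
Example Market B,Example Street 2,,100.20
"""

landmarks = load_landmarks(SAMPLE_CSV)
```

Rows without coordinates are dropped here because they cannot be placed on the village map in the later overlay step; a real workflow might instead flag them for manual checking.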

Landmark Extracts
The landmark information on urban facilities in Lubuk Sikaping District extracted with Botsol software can be seen in Table 3, and the scraping process with Botsol software can be seen in Figure 2. The attributes taken include landmark names, addresses, and coordinate points. The extracted landmark information is then re-checked so that the landmarks processed in the next stage meet the criteria for urban facilities used in calculating the score for village classification, as explained in Table 2. In addition, information on the village offices in Lubuk Sikaping District was also extracted as a basis for calculating the distance to the nearest urban facility when a village has no facility of a certain type.

Overlay Landmark Points of Urban Facilities With Digital Map of Villages
At this stage, the coordinate points of the previously extracted urban facility landmarks are overlaid on a digital map of the villages. The digital map used is the map of village administrative boundaries from the Geospatial Information Agency (BIG), and the software used for the overlay is QGIS. Before the overlay, the landmark point data in CSV format were converted into vector data in QGIS.
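Conceptually, the overlay described above is a point-in-polygon test: each landmark point is assigned to the village whose boundary contains it. The study performs this step in QGIS; the sketch below only illustrates the idea in plain Python with a ray-casting test. The function names are hypothetical, boundaries are assumed to be simple rings of (longitude, latitude) pairs without holes, and the unit squares in the usage lines are toy shapes, not real village boundaries.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (longitude, latitude)
Ring = List[Point]            # polygon boundary as a list of vertices


def point_in_polygon(lon: float, lat: float, ring: Ring) -> bool:
    """Ray-casting test: count boundary crossings of a ray going east."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_villages(
    points: Dict[str, Point], villages: Dict[str, Ring]
) -> Dict[str, Optional[str]]:
    """Map each landmark name to the first village containing it, or None."""
    result = {}
    for landmark, (lon, lat) in points.items():
        result[landmark] = next(
            (name for name, ring in villages.items()
             if point_in_polygon(lon, lat, ring)),
            None,
        )
    return result


# Toy geometry for illustration only.
toy_villages = {
    "Village A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "Village B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
toy_points = {"school": (0.5, 0.5), "hotel": (1.5, 0.5), "market": (3.0, 3.0)}
assignment = assign_villages(toy_points, toy_villages)
```

A landmark mapped to `None` lies outside every village polygon supplied, which in practice would mean it belongs to a neighboring district or has an incorrect coordinate.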
Based on the overlay results, the existence of urban facilities in a given village area can be seen. If a village has no facility of a certain type, the distance from the village office to the nearest such facility is calculated. The distance calculation uses the measure line feature between two landmark points in QGIS software. With the results of this stage, the score for the indicator of the presence of or access to urban facilities can be calculated for each village. In this study, we calculated this score for eight villages in Lubuk Sikaping District. Figure 3 and Figure 4 show examples of overlaying the landmarks of public high schools and hotels on a digital map of village boundaries in Lubuk Sikaping District using QGIS software. Figure 5 shows an example of calculating the distance from the village office to the nearest urban facility.
Table 4 shows the results of calculating the score for the existence of or access to urban facilities for eight villages in Lubuk Sikaping District. The overlay of urban facility landmark points on the digital map of villages, together with the scoring method described in Table 2, forms the basis for the scores in Table 4. Based on Table 4, four villages have the greatest potential to be classified as urban villages because they have the maximum score, namely Aia Manggih Selatan, Pauah, Durian Tinggi, and Jambak. Two villages have the greatest potential to be classified as rural villages because they have the minimum score: Sundata Selatan and Aia Manggih Utara. The final score is obtained by adding the scores of the other indicators, namely population density, the percentage of agricultural families, and families using landlines and electricity.
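The distance step described above, from a village office to the nearest facility of a given type, can be approximated outside QGIS with the haversine great-circle formula. This is a sketch under stated assumptions rather than the study's procedure: the measure line tool in QGIS may use an ellipsoidal or planar calculation depending on project settings, so its values can differ slightly. The function names and the spherical Earth radius are choices made for this illustration.

```python
import math
from typing import Iterable, Optional, Tuple

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation

LatLon = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_facility_km(
    office: LatLon, facilities: Iterable[LatLon]
) -> Optional[float]:
    """Distance from the village office to the closest facility, or None."""
    distances = [
        haversine_km(office[0], office[1], lat, lon) for lat, lon in facilities
    ]
    return min(distances) if distances else None
```

The resulting distance would then be compared with the distance thresholds in Table 2 to obtain the access score for that facility type.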

CONCLUSION
This study calculated the classification score of urban and rural villages for the indicator of the presence of or access to urban facilities. BPS - Statistics Indonesia obtains data on urban facilities for classifying urban and rural villages through the Podes survey. The urban facilities that must be recorded in the Podes survey are numerous, so data collection takes a lot of time, cost, and energy. Therefore, this study explored the use of landmark data on urban facilities from the Google Maps site to calculate the score for the existence of or access to urban facilities.
Landmark data were obtained using web scraping methods through Botsol software. The data were then processed further using QGIS software, for example by overlaying landmark points on a digital map of the villages. Eight villages in Lubuk Sikaping District were selected as a case study, and scores for the indicator of the presence of or access to urban facilities were successfully calculated for them. Based on these scores, four villages have great potential to be classified as urban villages, and three villages have great potential to be classified as rural villages. Future research is needed to develop a prototype of a smart system that can automatically extract data from the Google Maps site using the web scraping method and calculate the score for the presence or accessibility of urban facilities in real time.