Analisis Pemetaan Tingkat Kriminalitas di Kabupaten Karawang menggunakan Algoritma K-Means (Mapping Analysis of Crime Rates in Karawang Regency Using the K-Means Algorithm)

Author(s):  
Resti Noor Fahmi ◽  
Mohamad Jajuli ◽  
Nina Sulistiyowati

Crime is a recurring social problem that demands attention because it harms the community and has negative effects on it. According to jabar.tribunews.com, Karawang Regency ranked first for the highest crime rate in West Java at the start of the pandemic. This leaves the government, and the Karawang Police (Polres Karawang) in particular, with the task of handling and curbing crime in Karawang. This study uses clustering with the k-means algorithm and maps crime-prone areas using QGIS. Grouping the crime-prone areas of Karawang for 2019 yielded 23 districts (kecamatan) in the not-prone cluster, 3 districts in the prone cluster, and 4 districts in the highly prone cluster. For 2020, the not-prone cluster contained 22 districts, the prone cluster 4 districts, and the highly prone cluster 4 districts. Evaluating the clustering with the silhouette coefficient gave 0.52 for 2019 and 0.54 for 2020; both fall in the medium structure category, interpreted as a reasonable cluster placement.
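The silhouette evaluation mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the toy points are hypothetical, Euclidean distance is assumed, and the category thresholds follow the commonly cited Kaufman and Rousseeuw convention, which is assumed to be the scale the abstract refers to.

```python
from math import dist  # Python 3.8+


def silhouette_coefficient(points, labels):
    """Mean silhouette over all points; expects at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def interpret(score):
    """Assumed thresholds (Kaufman and Rousseeuw convention)."""
    if score > 0.70:
        return "strong structure"
    if score > 0.50:
        return "medium structure"
    if score > 0.25:
        return "weak structure"
    return "no substantial structure"


# Toy data: two well-separated groups (made-up values).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_score = silhouette_coefficient(points, labels)
```

Under these assumed thresholds, both reported values (0.52 and 0.54) map to "medium structure".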

2022 ◽  
Vol 10 (4) ◽  
pp. 583-593
Author(s):  
Syiva Multi Fani ◽  
Rukun Santoso ◽  
Suparti Suparti

Social media is computer-based technology that facilitates the sharing of ideas, thoughts, and information through virtual networks and communities. Twitter is one of the most popular social media platforms in Indonesia, with 78 million users. Businesses rely heavily on Twitter for advertising. By knowing which types of tweet content their followers retweet most, businesses can use that content to advertise to Twitter users. This study applies text mining to cluster tweets from the @bliblidotcom Twitter account using the K-means method, with the best number of clusters chosen by the Silhouette Coefficient method, to determine the types of tweet content most retweeted by @bliblidotcom followers. The tweets with the most retweets and favorites are discount offers and flash sales, so Blibli Indonesia could use this kind of tweet for advertising on Twitter. Prize quiz tweets are also liked by followers of the @bliblidotcom account.
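A minimal sketch of choosing the number of clusters by silhouette score might look like the following. It assumes scikit-learn is available and that tweets are vectorized with TF-IDF; the abstract does not say which vectorization was used, and the example tweets are invented for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

# Hypothetical tweets, not taken from the @bliblidotcom data.
tweets = [
    "flash sale today discount 50 percent",
    "big discount flash sale this weekend",
    "join our quiz and win a prize",
    "answer the quiz to win a prize voucher",
    "new phone arrives in store next week",
    "new laptop arrives in store tomorrow",
]

# Turn each tweet into a TF-IDF vector (assumed preprocessing step).
X = TfidfVectorizer().fit_transform(tweets).toarray()

# Fit k-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The candidate with the highest silhouette score is kept.
best_k = max(scores, key=scores.get)
```

The candidate range `range(2, 5)` is arbitrary here; on real data it would be widened to cover the plausible number of content types.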


2021 ◽  
Vol 6 (2) ◽  
pp. 48
Author(s):  
Solmin Paembonan ◽  
Hisma Abduh

This study uses the k-means method, which can group several similar drugs into a single data group. One way to determine how similar data points are is to compute the distance between them: the smaller the distance, the higher the similarity, and the greater the distance, the lower the similarity. The ultimate goal of clustering is to identify groups in a set of unlabeled data. Because clustering is an unsupervised method and there is no initial condition fixing the number of clusters that may form in a data set, the clustering results need to be evaluated. The evaluation of the clustering results gave a silhouette coefficient of 0.4854.
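The distance-based grouping described above can be sketched as a bare-bones k-means (Lloyd's algorithm). This is an illustrative sketch only: the two features per drug and all values are hypothetical, Euclidean distance is assumed, and the starting centroids are picked by hand rather than by any initialization scheme from the paper.

```python
from math import dist  # Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid,
        # i.e. the group it is most similar to.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical drug records: (scaled stock level, scaled monthly usage).
drugs = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(drugs, centroids=[drugs[0], drugs[3]])
```

Here the first three records end up in one group and the last three in the other, because each record is closer to its own group's centroid than to the other one.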


Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 751
Author(s):  
Xiaoyuan Liu ◽  
Senxiang Lu ◽  
Yan Ren ◽  
Zhenning Wu

In this paper, a wind turbine anomaly detection method based on generalized feature extraction is proposed. First, wind turbine (WT) attributes collected from the Supervisory Control And Data Acquisition (SCADA) system are clustered with k-means, and the Silhouette Coefficient (SC) is adopted to judge the effectiveness of the clustering. After clustering, the correlation between attributes within a class becomes larger and the correlation between classes becomes smaller. Then, the dimensions of the attributes within each class are reduced using t-Distributed Stochastic Neighbor Embedding (t-SNE), so that the low-dimensional attributes reflect the WT attributes more fully and more concisely. Finally, the detection model is trained, and the normal or abnormal state is indicated by a classification result of 0 or 1, respectively. Experiments on three cases with SCADA data demonstrate the effectiveness of the proposed method.


2020 ◽  
Vol 1 (1) ◽  
pp. 57-67
Author(s):  
Steven Pranata ◽  
Derry Alamsyah

Segmentation divides an image into parts or segments that are simpler and more meaningful so they can be analyzed further. The solution adopted here uses Maximum Likelihood Estimation (MLE) and the Gaussian Mixture Model (GMM). GMM is a clustering method: a function made up of several Gaussians, each identified by k ∈ {1, ..., K}, where K is the number of clusters in the dataset. Maximum likelihood estimation is a technique for finding the point that maximizes a function, and it is very widely used to estimate the parameters of a data distribution. Tests were carried out using mango images with 10 different backgrounds. GMM clusters the pixels of the mango image to produce means and covariances, which MLE then uses to classify each pixel of the mango image. In this study, GMM and MLE tests were carried out to segment mangoes. Based on the results obtained, the GMM and MLE methods have an error rate of 13.07% for 3 clusters, 8.06% for 4 clusters, and 6.63% for 5 clusters, with silhouette coefficient values of 0.37686 for 3 clusters, 0.29577 for 4 clusters, and 0.26162 for 5 clusters, which the authors describe as good cluster quality.
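For reference, the mixture described above can be written out under the standard formulation. The notation is introduced here rather than taken from the paper: x is a pixel's feature vector, π_k are the mixing weights, and μ_k and Σ_k are the mean and covariance of component k. The second line shows one plausible way each pixel could be assigned to a segment, namely the component with the highest weighted likelihood.

```latex
\begin{align}
p(x) &= \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
  \qquad \sum_{k=1}^{K} \pi_k = 1, \\
\hat{k}(x) &= \arg\max_{k \in \{1, \dots, K\}} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)
\end{align}
```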


2021 ◽  
Vol 128 ◽  
pp. 04009
Author(s):  
Dmitry Serpuhovitin

The article presents an original methodology for selecting the most in-demand measures of state support for a country's national innovation system using numerical clustering methods. The clustering methodology is based on a combination of indexes: the Global Innovation Index, Gross National Income, and the Human Development Index. From the lists of countries and their corresponding clusters obtained through empirical analysis, the most in-demand measures of state support for the national innovation system were identified from the retrospective dynamics of the Global Innovation Index indicators that characterize such support. Based on these indicators, recommendations were given for the direction of development of Russia's national innovation system. Classical clustering methods were used as analysis instruments: density-based spatial clustering of applications with noise (DBSCAN) and K-Means. The Silhouette Coefficient implemented in the sklearn library for the Python programming language was used as the quality metric.
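A minimal sketch of that toolchain, assuming scikit-learn is installed, could look like the following. The three columns stand in for the three indexes, but every number is invented, and the parameter values (`n_clusters`, `eps`, `min_samples`) are illustrative choices rather than the article's settings.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.metrics import silhouette_score

# Hypothetical, already-scaled index values per country (made-up numbers).
X = [[0, 0, 0], [0, 1, 0], [1, 0, 0], [1, 1, 0],
     [10, 10, 10], [10, 11, 10], [11, 10, 10], [11, 11, 10]]

# K-Means needs the number of clusters up front.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN infers clusters from density; a label of -1 would mark noise points.
dbscan_labels = DBSCAN(eps=1.5, min_samples=3).fit_predict(X)

# The same quality metric is applied to both labelings.
kmeans_score = silhouette_score(X, kmeans_labels)
dbscan_score = silhouette_score(X, dbscan_labels)
```

Scoring both labelings with the same metric is what makes the two methods comparable, although points DBSCAN marks as noise would need separate handling before scoring on real data.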


Jurnal INFORM ◽  
2020 ◽  
Vol 5 (2) ◽  
pp. 54
Author(s):  
Aloysius Matz Teguh Utomo

Loyal customers are one of the factors that determine the development of a business. Businesses therefore need a strategy to keep customers loyal, and even to make previously less loyal customers more loyal. The strategy must be targeted according to customer segmentation. The purpose of this paper is to model clusters of customer loyalty to help businesses make the right marketing strategy decisions. Segmentation is done using the k-means algorithm with LRIFMQ (length, recency, interval, frequency, monetary, quantity) as parameters, and the CLV (customer lifetime value) of each cluster is calculated. The data were obtained from PT. XYZ (a food processing company) for one year (1 January 2019 - 31 December 2019), covering 337,739 transactions and 26,683 customers. The AHP (analytical hierarchy process) method is used for LRIFMQ weighting because it includes a consistency index calculation. The silhouette coefficient is used to measure cluster quality and determine the optimal number of clusters. The best result is a silhouette coefficient of 0.632904 with 6 clusters.
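One common way to turn weighted segmentation variables into a per-cluster CLV score is a weighted sum of normalized cluster centroids. The sketch below assumes that approach; the abstract does not give the formula, and the weights and centroid values are hypothetical placeholders rather than AHP results from the paper.

```python
FEATURES = ["length", "recency", "interval", "frequency", "monetary", "quantity"]


def min_max(rows):
    """Scale each column to [0, 1]; constant columns become 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


def clv_scores(centroids, weights):
    """Weighted sum of normalized centroid values, one score per cluster."""
    return [sum(w * v for w, v in zip(weights, row))
            for row in min_max(centroids)]


# Hypothetical AHP weights (summing to 1) in FEATURES order.
weights = [0.1, 0.2, 0.1, 0.2, 0.3, 0.1]

# Hypothetical cluster centroids in FEATURES order.
# Note: this sketch treats higher as better for every variable; variables
# where lower is better (such as recency) would need inverting first.
centroids = [
    [10, 5, 3, 2, 100, 4],
    [20, 10, 6, 4, 200, 8],
]
scores = clv_scores(centroids, weights)
```

Ranking clusters by this score gives an ordering from least to most valuable segment, which is the kind of output a marketing strategy could be built on.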


2019 ◽  
Vol 4 (1) ◽  
pp. 42
Author(s):  
Yudha Alif Auliya ◽  
Wayan Firdaus Mahmudy ◽  
Sudarto Sudarto

Abstract. Potato production is strongly influenced by the selection of suitable land. Land suitability for planting potatoes is determined by climatic factors and land characteristics. Planted areas are clustered based on 11 land suitability criteria. The clustering yields four clusters: very suitable (S1), suitable (S2), quite suitable (S3), and not suitable (N). Clustering the land aims to improve the quality and quantity of the potato crop. Clustering is done using a hybrid of Particle Swarm Optimization and K-Means (KCPSO). The hybrid method is used to obtain accurate clusters. This study takes a new approach, improving KCPSO with a random injection method. The cost value is calculated from the silhouette coefficient. KCPSO gave better results than the K-Means algorithm without hybridization. KCPSO obtains the best centroids, as indicated by the largest silhouette coefficient value. Keywords: Clustering, K-Means, Particle Swarm Optimization, random injection, Silhouette Coefficient.


TEM Journal ◽  
2020 ◽  
pp. 929-936
Author(s):  
Mochammad Haldi Widianto ◽  
Ivan Diryana Sudirman ◽  
Muhammad Hanif Awaluddin

Social media is used as a way of finding information, with Twitter as one such medium. Natural disasters are very damaging, so an application is needed to observe natural disasters through Twitter. This research starts from the small number of studies that use clustering methods based on the density of Twitter user data. The availability of data in certain areas makes grouping easy. The data are then grouped based on a high degree of similarity. One result of applying this method is the location of the disaster. NER-based rules are used to find the area of the disaster. Data accuracy testing is performed using the Silhouette coefficient.


2021 ◽  
Vol 4 (2) ◽  
pp. 150-167
Author(s):  
Laurence ◽  
Devanny Gumulya ◽  
J. Sandra Sembel ◽  
Magdalena Lestari Ginting

Tourism is an important contributor to a country's economy. This research focuses on foreign tourist visits to Japan, using data on the number of visiting tourists and on tourist spending in the categories of accommodation, entertainment, food and beverages, shopping, transportation, and others. Previous studies did not group countries by these types of spending, so this research fills that gap by grouping countries based on tourist spending. The study also builds a forecasting model using the ARIMA method, accommodating trend and seasonality. The data, consisting of six types of spending, were reduced to 2 dimensions with an explained variance of 83.84%. The results show 2 groups of tourist countries based on spending, consisting of 8 OECD member countries and 12 non-OECD countries. Tourists from OECD countries play an important role in the world economy, contributing 50.5% of total world tourist spending. Cluster quality is categorized as good, with an average silhouette coefficient and cohesion value of 0.56. This grouping can serve as a basis for studying consumer behavior in each country. ARIMA forecasting can be used by including trend and seasonal elements in the model. The R² values of the forecasting model show good results for most of the tourist data from the 20 countries. This seasonal ARIMA model can be considered for forecasting the number of arriving tourists. Keywords: principal component analysis, k-means clustering, silhouette coefficient and cohesion values, ARIMA
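Assuming the reduction was done with principal component analysis, as the keywords suggest, the reported explained variance corresponds to the share of total variance carried by the first two components. Here λ_1 ≥ ... ≥ λ_6 denote the eigenvalues of the covariance matrix of the six spending variables, notation introduced for this sketch:

```latex
\frac{\lambda_1 + \lambda_2}{\sum_{i=1}^{6} \lambda_i} = 0.8384
```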

