Clustering of Regencies/Cities in Indonesia Based on 2020 Poverty Information Using K-Means Clustering Analysis

2021 ◽  
Vol 1 (1) ◽  
pp. 190-199
Author(s):  
Rijalul Fikri ◽  
Aswin Mushardiyanto ◽  
Mochamad Naufal Laudza’Banin ◽  
Kristiana Maureen ◽  
Harry Patria

From the 2020 regency/city poverty dataset published by Statistics Indonesia (Badan Pusat Statistik), twenty independent variables were selected for this study. A correlation test among these variables identified pairs with very high correlation: 0.921 between Percentage of Poor Population and P1 (Poverty Gap Index), and 0.964 between P1 (Poverty Gap Index) and P2 (Poverty Severity Index). Using such highly correlated variables together would cause multicollinearity, so Principal Component Analysis (PCA) was applied to remove it. With the cumulative proportion of variance as the criterion and a minimum explained variance of 80%, PCA produced three new data dimensions, that is, three new independent variables. Using these new input variables (PCA 0, PCA 1, and PCA 2), the number of clusters was determined with the Silhouette Coefficient method and the data were clustered with K-Means. This yielded four clusters: cluster 1 with 117 regencies/cities, cluster 2 with 154, cluster 3 with 173, and cluster 4 with 70.
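The 80% cumulative-variance rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `components_for_variance` and the eigenvalues are assumptions made up for the example, chosen only so that three components cross the threshold.

```python
def components_for_variance(eigenvalues, threshold=0.80):
    """Return (k, cumulative_shares), where k is the smallest number of
    leading components whose cumulative variance share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = []
    running = 0.0
    for value in ordered:
        running += value / total
        cumulative.append(running)
    for k, share in enumerate(cumulative, start=1):
        # Small tolerance guards against floating-point rounding at the boundary.
        if share >= threshold - 1e-12:
            return k, cumulative
    return len(ordered), cumulative


# Hypothetical eigenvalues for illustration only (not taken from the paper).
example_eigenvalues = [9.0, 4.5, 3.0, 1.5, 1.0, 1.0]
k, shares = components_for_variance(example_eigenvalues)
print(k, [round(s, 3) for s in shares])
```

With these invented eigenvalues the first two components explain 67.5% of the variance and the first three explain 82.5%, so the rule keeps three components.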

2021 ◽  
Vol 4 (2) ◽  
pp. 150-167
Author(s):  
Laurence ◽  
Devanny Gumulya ◽  
J. Sandra Sembel ◽  
Magdalena Lestari Ginting

Tourism is an important contributor to a country's economy. This study examines foreign tourist visits to Japan, using data on the number of visiting tourists and their spending on accommodation, entertainment, food and beverages, shopping, transportation, and other items. Previous studies did not group countries by these spending categories, so this study fills that gap by clustering countries according to tourist spending. The study also builds a forecasting model using ARIMA that accommodates trend and seasonality. The six spending categories were reduced to 2 components that explain 83.84% of the variance. The analysis yields 2 groups of tourist source countries based on spending: 8 OECD member countries and 12 non-OECD countries. Tourists from OECD member countries play an important role in the world economy, contributing 50.5% of total world tourist spending. Cluster quality is categorized as good, with an average silhouette coefficient and cohesion value of 0.56. The grouping can serve as a basis for studying consumer behavior in each country. ARIMA forecasting can be applied by including trend and seasonal elements in the model. The R² values of the forecasting models are good for most of the tourist data from the 20 countries. This seasonal ARIMA model can be considered for forecasting the number of arriving tourists.

Keywords: principal component analysis, k-means clustering, silhouette coefficient and cohesion values, ARIMA
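The silhouette coefficient used above to judge cluster quality can be computed directly from its definition. The sketch below is illustrative only: the function name, the four 2-D points, and both labelings are assumptions invented for the example, and Euclidean distance is assumed as the metric.

```python
import math


def silhouette_scores(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b), where a is the mean
    distance to the other members of the point's own cluster and b is the
    smallest mean distance to the members of any other cluster.
    Assumes at least two distinct labels."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a singleton cluster gets a score of 0.
            scores.append(0.0)
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return scores


# Hypothetical data: two tight pairs of points, five units apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = silhouette_scores(points, [0, 0, 1, 1])  # labels match the geometry
bad = silhouette_scores(points, [0, 1, 0, 1])   # labels cut across the pairs
good_mean = sum(good) / len(good)
bad_mean = sum(bad) / len(bad)
print(round(good_mean, 3), round(bad_mean, 3))
```

A mean close to 1 indicates compact, well-separated clusters, while a negative mean indicates that points sit closer to another cluster than to their own.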


Author(s):  
Bingxian Leng ◽  
Yunfei Fu ◽  
Siyuan Li

This paper draws on pedigree (hierarchical) cluster analysis, grey prediction, and principal component analysis. The clustering model, the GM(1,1) model, and the principal component analysis model were built in SPSS, and MATLAB was used to calculate the correlation matrices. For January, the differences in the price changes of major foods across cities were calculated, and the various food prices for June 2016 were forecast. For the first problem, the main foods were classified and the data processed. SPSS was then used to group the 27 foods into four categories with the pedigree (hierarchical) cluster analysis model, and Excel was used to plot each category's prices over time as line charts, from which the characteristics of food price volatility were analyzed. For the second problem, a grey prediction model was built for each food category. The original data were first accumulated, tested, and processed so that the series showed strong regularity; a grey differential equation was then established and solved in MATLAB. The residual test and the posterior-variance test gave C < 0.35, indicating good prediction accuracy. Finally, the fitted function was used to predict the price trend for June 2016. For the third problem, principal component analysis of the given data for the 27 foods identified celery, octopus, chicken (white striped chicken), duck, and Chinese cabbage as the main components. By monitoring this small set of foods, the CPI can be predicted relatively accurately.
Based on regional characteristics, Shanghai and Shenyang were selected. Using the relevant CPI and food price data, principal component analysis in SPSS was used to analyze the impact of the CPI on several types of food, and weights were then calculated in MATLAB. Comparing the results shows that different regions should select different types of food for monitoring.
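The GM(1,1) procedure and the posterior-variance check (C < 0.35) mentioned in this abstract can be sketched in a few lines. This is a textbook-style illustration under stated assumptions, not the authors' MATLAB model: the function names and the price series are hypothetical, and the series is assumed to be non-constant so that the development coefficient a is nonzero.

```python
import math


def gm11_fit(x0):
    """Estimate GM(1,1) parameters (a, b) from x0(k) + a*z1(k) = b by least
    squares, where z1 is the mean of consecutive accumulated values."""
    x1, running = [], 0.0
    for value in x0:  # accumulated generating operation (cumulative sum)
        running += value
        x1.append(running)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    z_mean = sum(z) / len(z)
    y_mean = sum(y) / len(y)
    slope = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y)) / sum(
        (zi - z_mean) ** 2 for zi in z
    )
    # Regressing y on z gives y = slope*z + intercept, so a = -slope, b = intercept.
    return -slope, y_mean - slope * z_mean


def gm11_predict(x0, a, b, steps):
    """Fitted values for the observed periods followed by `steps` forecasts."""
    def x1_hat(k):  # time response of the accumulated series, k = 0 is period 1
        return (x0[0] - b / a) * math.exp(-a * k) + b / a
    return [x0[0]] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, len(x0) + steps)]


def posterior_variance_ratio(x0, fitted):
    """C = std(residuals) / std(original data); smaller is better."""
    def std(values):
        mean = sum(values) / len(values)
        return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return std([obs - fit for obs, fit in zip(x0, fitted)]) / std(x0)


# Hypothetical monthly prices for illustration only (not the paper's data).
prices = [10.0, 10.5, 11.1, 11.6, 12.2, 12.8]
a, b = gm11_fit(prices)
predicted = gm11_predict(prices, a, b, steps=1)
c_ratio = posterior_variance_ratio(prices, predicted[: len(prices)])
print(round(a, 4), round(predicted[-1], 2), round(c_ratio, 4))
```

A negative a corresponds to a growing series; the last element of `predicted` is the one-step-ahead forecast, and `c_ratio` is the quantity compared against the 0.35 threshold.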


2020 ◽  
Vol 214 ◽  
pp. 03003
Author(s):  
Jiayi Yan ◽  
Qian Pu ◽  
Junfei Liu

Based on the knowledge of economics, this paper selects 22 macroeconomic indicators that best reflect the overall economic situation of the United States. After differencing, logarithmic, and exponential preprocessing of the original data, the paper uses a power spectral analysis model to adaptively identify the periodicity of the selected indicators and visualizes the results. This screens out 11 indicators with obvious periodicity. To compute the weighted distance based on principal component analysis, a correlation test is first conducted on the 11 periodic indicators to obtain a Pearson correlation heatmap. The first five principal components are then extracted as virtual indicators representing the monthly economic situation, and the weighted distance between months is calculated and visualized. Next, the results after 36 months' smoothing are analyzed to find time intervals with similar economic situations and to verify the conjecture of economic periodicity. Finally, based on K-means clustering analysis, the economic conditions of 352 months are classified into 3 clusters using the weighted distance after 36 months' smoothing. The visualized results show two complete cycles, i.e. red-yellow-blue and red-yellow-blue, which is consistent with the conclusion of the principal component analysis model and again supports the existence of an economic cycle. In conclusion, based on the PCA weighted distance and the clustering analysis, the economic period is around 176 months, in favor of the medium-long periodicity theory.
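The idea of screening indicators for periodicity with a power spectrum can be illustrated with a plain periodogram. This is a simplified sketch, not the paper's adaptive model: the function name is hypothetical, the series is synthetic, and the only preprocessing assumed here is mean removal.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in samples) of the strongest periodogram peak.
    The power at frequency k/n is |DFT_k|^2 / n, computed by direct summation."""
    n = len(series)
    mean = sum(series) / n
    centered = [value - mean for value in series]  # drop the zero-frequency term
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(centered)
        )
        power = abs(coeff) ** 2 / n
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic indicator: ten full cycles of a 12-month wave around a level of 5.
monthly = [5.0 + math.sin(2 * math.pi * t / 12) for t in range(120)]
period = dominant_period(monthly)
print(period)
```

An indicator with no clear cycle would show a flat or diffuse spectrum instead of a single sharp peak, which is the kind of contrast a periodicity screen relies on.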


2005 ◽  
Vol 5 (3-4) ◽  
pp. 197-208
Author(s):  
S. Chung ◽  
H. Lee ◽  
M. Yu ◽  
J. Koo ◽  
I. Hyun ◽  
...  

To identify the relation between the revenue water (RW) ratio and key local factors in a quantifiable way, 90 effect factors were considered as regional characteristics for 79 Korean cities. Seven statistically significant effect factors were chosen through correlation analysis. Three principal components independently influencing the RW ratio were extracted by principal component analysis (PCA). The 79 cities were grouped into six clusters by k-means clustering (KMC) of the cities' factor scores. Key local factors were then identified and their impacts quantified by multiple regression analysis (MRA), and the results were verified by t-test and F-test. The correlation-PCA-KMC-MRA approach proved to be a scientific way of identifying key local factors. The results suggest that a shorter distribution system, a water supply with a smaller number of larger customer meters, and gravitational supply through a reservoir would be advantageous from the point of view of the RW ratio.


2021 ◽  
Vol 38 (1) ◽  
pp. 109-119
Author(s):  
G.A. Adebusuyi ◽  
O.F. Oyedeji ◽  
V.I. Alaje ◽  
I.L. Sowunmi ◽  
Y.A. Dunmade

Jatropha curcas is a multi-purpose tree of significant economic importance that has not been fully exploited because of the lack of an adequate breeding programme in Nigeria. Consequently, 31 accessions collected from 4 states in southwestern Nigeria were assessed for their morphological diversity, to establish a foundation for further breeding programmes. Data were collected on plant height, number of leaves, and collar diameter; these were subjected to analysis of variance, principal component analysis, and cluster analysis using Minitab version 17. The results showed significant differences (p≤0.05) among the 31 accessions assessed. Principal component analysis indicated that the first three axes contributed 97.8% of the total variation observed. The first axis accounted for 68% of the total variation, while the second and third axes accounted for 24.7% and 5.1%, respectively. Cluster analysis and the dendrogram revealed three distinct clusters of genetic similarities and differences. High genetic similarities were observed among accessions collected from different states, whereas some accessions collected from similar regions had low genetic similarities. Cluster 1 consisted of 21 genotypes with their characters falling below the grand mean. Cluster 2 had nine genotypes, which produced the highest values for all the characters assessed. Cluster 3, with only one genotype, had values below the grand mean. Members of cluster 2 proved to be superior. The existence of morphological diversity offers potential for selection among the accessions in the breeding of J. curcas from southwestern Nigeria.

