Comparison of Single Linkage, Complete Linkage, and Average Linkage Methods on Community Welfare Analysis in Cities and Regencies in East Java

Community welfare is an important concern for any region and lies at the heart of national development. Welfare in Indonesia is unevenly distributed, particularly in East Java. One way to map the regions of East Java by the welfare of their people is clustering. Hierarchical clustering is one family of clustering methods, and within it single linkage, complete linkage, and average linkage are suitable for grouping the data; this study compares them to find the best method. The results show that average linkage with three clusters performs best, with a silhouette index of 0.6054. The first cluster contains 23 regions, the cities/districts with the highest community welfare; the second contains 11 regions, the cities/districts with moderate welfare; and the third contains 4 regions, the cities/districts with the lowest community welfare.
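The silhouette index used to rank the linkage methods can be sketched as follows. This is a minimal illustration, not the study's own computation: it assumes Euclidean distance, scores singleton clusters as 0 by convention, and uses made-up toy points; the function name `silhouette_index` is hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_index(points, labels):
    """Mean silhouette width; labels[i] is the cluster of points[i]."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Toy data (illustrative only): two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_index(points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_index(points, [0, 1, 0, 1])   # labels cut across the groups
print(good, bad)
```

Values near 1 indicate compact, well-separated clusters, so a comparison like the one in the abstract would keep the method and cluster count with the highest index.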


Perbandingan Metode Single Linkage, Complete Linkage Dan Average Linkage dalam Pengelompokan Kecamatan Berdasarkan Variabel Jenis Ternak Kabupaten Sidoarjo

Jurnal INFORM ◽

10.25139/inform.v4i2.1696 ◽

2019 ◽

Vol 4 (2) ◽

Author(s):

Sulthan Fikri Mu'afa ◽

Nurissaidah Ulinnuha

Keyword(s):

Farm Animals ◽

Single Linkage ◽

Daily Lives ◽

Food Ingredients ◽

Complete Linkage ◽

Material Sources ◽

Average Linkage ◽

Linkage Method ◽

Livestock Products ◽

Labor Resources

Livestock products are widely used in daily life, for example as food ingredients, industrial raw materials, labor resources, fertilizer sources, and energy sources. This study clusters livestock potential using 2017 livestock population data for Sidoarjo Regency with the single linkage, complete linkage, and average linkage methods, and compares the performance of the methods. The data are grouped into 3 clusters. With single linkage and average linkage, sixteen sub-districts fall in the first cluster, with low livestock potential, and one sub-district each falls in the second and third clusters. Complete linkage instead places fifteen sub-districts in the first cluster with high livestock potential, two sub-districts in the second cluster with medium livestock potential, and one sub-district in the third cluster with high livestock potential. Comparing the standard deviation ratios, complete linkage obtains the smallest value, 0.222, which indicates that complete linkage outperforms single linkage and average linkage for grouping sub-districts by livestock type in Sidoarjo Regency.


Evaluation of the Gower coefficient modifications in hierarchical clustering

Advances in Methodology and Statistics ◽

10.51936/eqvy9516 ◽

2017 ◽

Vol 14 (1) ◽

Author(s):

Zdeněk Šulc ◽

Martin Matějka ◽

Jiří Procházka ◽

Hana Řezanková

Keyword(s):

Hierarchical Clustering ◽

Mixed Type ◽

Similarity Measures ◽

Rand Index ◽

Clustering Methods ◽

Single Linkage ◽

Linkage Methods ◽

Hierarchical Clustering Methods ◽

Nominal Variables

This paper thoroughly examines three recently introduced modifications of the Gower coefficient, which were designed for hierarchical clustering of data with mixed-type variables. In contrast to the original Gower coefficient, which for nominal variables only recognizes whether two categories match, the examined modifications offer three different approaches to measuring the similarity between categories. The examined dissimilarity measures are compared and evaluated on the quality of their clusters, measured by three internal indices (Dunn, silhouette, McClain), and on their classification ability, measured by the Rand index. The comparison is performed on 810 generated datasets. The performance of the similarity measures is evaluated across different data characteristics (the number of variables, the number of categories, the distance between clusters, etc.) and different hierarchical clustering methods (average, complete, McQuitty, and single linkage). As a result, two modifications are recommended for use in practice.
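For orientation, the original Gower coefficient that these modifications build on can be sketched as below. The sketch assumes equal variable weights and no missing values, expresses the result as a dissimilarity (1 minus similarity), and uses invented records; the function name `gower_dissimilarity` is hypothetical, and the three modifications examined in the paper are not reproduced here.

```python
def gower_dissimilarity(x, y, is_numeric, ranges):
    """Original Gower dissimilarity between two mixed-type records.

    Numeric variables contribute |x - y| / range; nominal variables
    contribute 0 on a match and 1 on a mismatch (simple matching).
    """
    total = 0.0
    for k, (a, b) in enumerate(zip(x, y)):
        if is_numeric[k]:
            total += abs(a - b) / ranges[k] if ranges[k] else 0.0
        else:
            total += 0.0 if a == b else 1.0
    return total / len(x)


# Invented records: one numeric variable and two nominal variables.
records = [
    (20.0, "red", "yes"),
    (30.0, "red", "no"),
    (40.0, "blue", "no"),
]
is_numeric = [True, False, False]
# Range of each numeric variable over the data set (None for nominal ones).
ranges = [
    max(r[k] for r in records) - min(r[k] for r in records) if is_numeric[k] else None
    for k in range(len(is_numeric))
]

d01 = gower_dissimilarity(records[0], records[1], is_numeric, ranges)
d02 = gower_dissimilarity(records[0], records[2], is_numeric, ranges)
print(d01, d02)
```

The nominal branch is where the original coefficient only distinguishes match from mismatch, which is the limitation the abstract says the modifications address.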


Analisa Hasil Pengelompokan Wilayah Kejadian Non-Kebakaran Menggunakan Agglomerative Hierachical Clustering di Semarang

Jurnal Tekno Kompak ◽

10.33365/jtk.v15i2.1166 ◽

2021 ◽

Vol 15 (2) ◽

pp. 63

Author(s):

Desy Exasanti ◽

Arief Jananto

Keyword(s):

Hierarchical Clustering ◽

Manhattan Distance ◽

Agglomerative Hierarchical Clustering ◽

Single Linkage ◽

Bottom Up ◽

Environment Analysis ◽

Complete Linkage ◽

Average Linkage

Abstract: Clustering is a method of grouping data whose class labels are already known in order to find new clusters from the observations. There are many clustering methods, namely centroid-based, hierarchical, density-based, and grid-based methods; this study uses a hierarchical method. A hierarchical method groups objects by building a hierarchy of clusters, although this is not necessarily depicted like the hierarchy of an organization. Agglomerative Hierarchical Clustering was chosen; it is a bottom-up approach in which each object under test starts as a cluster of its own, and clusters are then merged iteratively into larger clusters. The data are non-fire incident records from the Semarang City Fire Department, which are used to group the regions where non-fire incidents are handled. The Fire Department handles more than fires: non-fire incidents include reptile evacuation, cat evacuation, rescue of accident victims, and so on. Non-fire data from 16 sub-districts in Semarang City in 2019 are tested with three algorithms: Single Linkage, Average Linkage, and Complete Linkage. Single Linkage merges clusters based on the smallest distance between data objects, Average Linkage on the average distance between data objects, and Complete Linkage on the largest distance. The implementation and visualization of the test data in this study use WEKA 3.8.4; the Waikato Environment for Knowledge Analysis, commonly known as WEKA, is software written in the Java programming language. From a dataset of 380 records, a sample of 100 records was tested in WEKA using the Manhattan distance with 3 clusters. The results can be visualized as a dendrogram with the visualize tree feature, or as a plot with the visualize cluster assignments feature.
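The bottom-up procedure and the three linkage criteria described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the WEKA implementation used in the study: the function names and the toy points are hypothetical, and the naive search recomputes all pairwise distances at every step.

```python
def manhattan(p, q):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Each linkage reduces the list of pairwise distances between two clusters.
LINKAGES = {
    "single": min,                            # smallest pairwise distance
    "complete": max,                          # largest pairwise distance
    "average": lambda ds: sum(ds) / len(ds),  # mean pairwise distance
}


def agglomerative(points, n_clusters, linkage="average"):
    """Merge the two closest clusters until n_clusters remain."""
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]  # every object starts alone
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ds = [manhattan(points[i], points[j])
                      for i in clusters[a] for j in clusters[b]]
                d = combine(ds)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # fuse cluster b into a
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy points (illustrative only), grouped into 3 clusters as in the study.
points = [(1, 2), (2, 1), (8, 9), (9, 8), (20, 1)]
result = {name: agglomerative(points, 3, name) for name in LINKAGES}
print(result)
```

On data this cleanly separated all three criteria agree; they diverge on elongated or noisy groups, which is why studies like this one compare them.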


An Empirical Comparison and Verification Study on the Containerports Clustering Measurement Using K-Means and Hierarchical Clustering(Average Linkage Method Using Cross-Efficiency Metrics, and Ward Method) and Mixed Models

Journal of Korea Port Economic Association ◽

10.38121/kpea.2018.09.34.3.17 ◽

2018 ◽

Vol 34 (3) ◽

pp. 17-52

Author(s):

Ro-Kyung Park

Keyword(s):

Hierarchical Clustering ◽

Mixed Models ◽

Empirical Comparison ◽

Average Linkage ◽

Linkage Method ◽

Ward Method


PENCARIAN CLUSTER OPTIMUM PADA SINGLE LINKAGE, COMPLETE LINKAGE DAN AVERAGE LINKAGE

Bimaster : Buletin Ilmiah Matematika, Statistika dan Terapannya ◽

10.26418/bbimst.v8i3.33173 ◽

2019 ◽

Vol 8 (3) ◽

Author(s):

Nur Asiska, Neva Satyahadewi, Hendra Perdana

Keyword(s):

Global Optimum ◽

Single Linkage ◽

Complete Linkage ◽

Average Linkage

Cluster analysis is a multivariate technique used to group objects/cases (respondents) into smaller groups, where each group contains objects/cases that are similar to one another. Cluster analysis uses two kinds of grouping procedure: hierarchical and non-hierarchical cluster analysis. The optimum number of clusters is determined by identifying the pattern of variance movement across clusters and finding where it reaches the global optimum. The cluster position that reaches the global optimum in the variance movement pattern is found with the valley-tracing method. In this study, hierarchical cluster analysis is applied to group the regencies/cities of Kalimantan Barat based on Human Development Index (IPM) indicators. The analysis yields an optimum of 4 clusters for the single linkage method, 5 clusters for the complete linkage method, and 5 clusters for the average linkage method. Keywords: Multivariate Analysis, Cluster Analysis, Optimum Cluster


PENGELOMPOKAN DESA/KELURAHAN DI KOTA DENPASAR MENURUT INDIKATOR PENDIDIKAN

E-Jurnal Matematika ◽

10.24843/mtk.2016.v05.i02.p119 ◽

2016 ◽

Vol 5 (2) ◽

pp. 38

Author(s):

NI WAYAN ARIS APRILIA A.P ◽

I GUSTI AYU MADE SRINADI ◽

KARTIKA SARI

Keyword(s):

Cluster Analysis ◽

Data Analysis ◽

The Other ◽

Single Linkage ◽

Complete Linkage ◽

Hierarchical Method ◽

Average Linkage ◽

Different Characteristics

Cluster analysis is a data analysis technique used to classify objects into clusters such that objects in the same cluster share characteristics, whereas objects in other clusters have different characteristics. One family of cluster analysis methods is the hierarchical method. Within the hierarchical method, linkage methods define how clusters are merged. Linkage methods are generally divided into 5 methods: single linkage, complete linkage, average linkage, Ward, and centroid. The purpose of this study was to determine the best linkage method among single linkage, complete linkage, average linkage, and Ward, using Euclidean and Pearson proximity distances. Based on the smallest value of CTM (Cluster Tightness Measure), the best linkage method found in this research was average linkage with Pearson distance.


Klasszikus klaszterező algoritmusok módosítása körút alapon

Multidiszciplináris Tudományok ◽

10.35925/j.multi.2021.4.9 ◽

2021 ◽

Vol 11 (4) ◽

pp. 81-86

Author(s):

Anita Agárdi

Keyword(s):

Single Linkage ◽

Complete Linkage ◽

Average Linkage

In this paper I present a modification of classical clustering algorithms. I present a method with which the clustering algorithms themselves determine the cluster boundaries, that is, how many groups the elements of the data set should be divided into. Clustering is a data mining method in which elements that are similar to each other are placed in the same cluster, while elements that differ from each other are placed in separate clusters. In this paper I present a partitioning algorithm (K-Means) and the hierarchical methods (Single Linkage, Complete Linkage, Average Linkage, Ward, Centroid). The run results show that the clustering algorithms more or less managed to form the clusters without expecting the number of clusters as input.


Penerapan Metode Angglomerative Hierarchical Clustering untuk Klasifikasi Kabupaten/Kota di Propinsi Jawa Timur Berdasarkan Kualitas Pelayanan Keluarga Berencana

CAUCHY ◽

10.18860/ca.v4i1.3172 ◽

2015 ◽

Vol 4 (1) ◽

pp. 25

Author(s):

Alfi Fadliana ◽

Fachrur Rozi

Keyword(s):

Hierarchical Clustering ◽

Good Condition ◽

Cluster Solution ◽

Agglomerative Hierarchical Clustering ◽

Clustering Methods ◽

Cophenetic Correlation ◽

Service Personnel ◽

Medium Condition ◽

Average Linkage ◽

Hierarchical Clustering Methods

Agglomerative hierarchical clustering methods are cluster analysis methods whose primary purpose is to group objects based on their characteristics; they begin with individual objects and proceed until all objects are fused into a single cluster. Agglomerative hierarchical clustering methods are divided into single linkage, complete linkage, average linkage, and Ward. This research compared the four agglomerative hierarchical clustering methods in order to find the best cluster solution for classifying regencies/cities in East Java province by the quality of “Keluarga Berencana” (KB) services. The results showed that, based on the cophenetic correlation coefficient, the best cluster solution is produced by the average linkage method. This method yielded four clusters with different characteristics. Cluster 1 has an “extremely bad condition” on the qualification of KB clinics and the competence of KB service personnel. Cluster 2 has a “good condition” on the qualification of KB clinics and a “bad condition” on the competence of KB service personnel. Cluster 3 has a “bad condition” on the qualification of KB clinics and a “medium condition” on the competence of KB service personnel. Cluster 4 has a “medium condition” on the qualification of KB clinics and a “good condition” on the competence of KB service personnel.


Construction of the Core Collection of Catalpa fargesii f. duclouxii (Huangxinzimu) Based on Molecular Markers and Phenotypic Traits

Forests ◽

10.3390/f12111518 ◽

2021 ◽

Vol 12 (11) ◽

pp. 1518

Author(s):

Huifen Xue ◽

Xiaochi Yu ◽

Pengyue Fu ◽

Bingyang Liu ◽

Shen Zhang ◽

...

Keyword(s):

Molecular Markers ◽

Core Collection ◽

Snp Markers ◽

Rate Of Change ◽

Phenotypic Traits ◽

Clustering Methods ◽

Single Linkage ◽

Core Collections ◽

The Core ◽

Linkage Method

To promote the conservation and utilization of Catalpa fargesii f. duclouxii (Huangxinzimu) germplasm resources, a total of 252 accessions were used to construct a preliminary core collection according to phenotypic traits and single nucleotide polymorphism (SNP) markers. In this study, 24 phenotypic traits, namely, 9 quantitative traits and 15 qualitative traits, were investigated. The core collection of C. fargesii f. duclouxii (Huangxinzimu) was constructed to remove redundant samples from the collected materials. First, the phenotypic core collection, with a sample proportion of 30, consisting of 24 clones, was constructed according to two genetic distances (Euclidean distance and Mahalanobis), four system clustering methods (the unweighted pair-group average method, Ward’s method, the complete linkage method, and the single linkage method), and three sampling methods (random sampling, deviation sampling, and preferred sampling). The best construction strategies were selected for further comparison. Three core collections (D2C3S3-30, D2C3S3-50, and D2C3S3-70) were constructed according to the optimal construction strategy at three sampling proportions. The core collection D2C3S3-30 with the best parameters was evaluated by using six parameters: the mean difference percentage (MD), variance difference percentage (VD), periodic rate of range (CR), changeable rate of the coefficient of variation (VR), minimum rate of change (CRMIN), and maximum rate of change (CRMAX). Three core collections (M-30, M-50, and M-70) were constructed by molecular markers, and the optimal core collection M-30 was selected by using five parameters, namely, Ho, He, PIC, MAF, and loci. The combination of D2C3S3-30 and M-30 was used to construct the final core collection DM-45, 45 samples representing the complete range of phenotypic and genetic variability. 
In this study, phenotypic traits combined with molecular markers were used to construct core collections to effectively capture the entire range of trait variation, effectively representing the original germplasm and providing a basis for the conservation and utilization of C. fargesii f. duclouxii (Huangxinzimu).


Validity studies among hierarchical methods of cluster analysis using cophenetic correlation coefficient

Brazilian Journal of Radiation Sciences ◽

10.15392/bjrs.v7i2a.668 ◽

2019 ◽

Vol 7 (2A) ◽

Cited By ~ 1

Author(s):

Priscilla Ramos Carvalho ◽

Casimiro Sepúlveda Munita ◽

André Luiz Lapolli

Keyword(s):

Cluster Analysis ◽

Correlation Coefficient ◽

Single Linkage ◽

Cophenetic Correlation ◽

Data Set ◽

Archaeological Data ◽

Average Linkage ◽

Linkage Method ◽

Cophenetic Correlation Coefficient ◽

Statistical Program

The literature presents many methods for partitioning a data set, and it is difficult to choose the most suitable one, since the various combinations of methods based on different measures of dissimilarity can lead to different grouping patterns and false interpretations. Nevertheless, little effort has been expended in evaluating these methods empirically using an archaeological data set. The objective of this work is therefore to make a comparative study of the different cluster analysis methods and identify the most appropriate one. The study was carried out using a data set of 45 samples of ceramic fragments, analyzed by instrumental neutron activation analysis (INAA). The methods used were single linkage, complete linkage, average linkage, centroid, and Ward. The validation was done using the cophenetic correlation coefficient; comparing these values, the average linkage method obtained the best results. A script in the statistical program R, with some functions, was created to obtain the cophenetic correlation. These values made it possible to choose the most appropriate method for the data set.
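The cophenetic correlation coefficient used for validation can be sketched as follows. This is a plain-Python illustration, not the authors' R script: it assumes Euclidean distance and a naive agglomerative procedure, covers only single, complete, and average linkage, and uses invented toy points; the function names are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def cophenetic_correlation(points, linkage="average"):
    """Correlate original distances with dendrogram merge heights."""
    combine = {
        "single": min,
        "complete": max,
        "average": lambda v: sum(v) / len(v),
    }[linkage]
    n = len(points)
    d = {(i, j): dist(points[i], points[j]) for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]
    coph = {}  # cophenetic distance: height at which a pair is first joined
    while len(clusters) > 1:
        height, a, b = min(
            (combine([d[min(i, j), max(i, j)]
                      for i in clusters[a] for j in clusters[b]]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        for i in clusters[a]:
            for j in clusters[b]:
                coph[min(i, j), max(i, j)] = height
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    pairs = sorted(d)
    return pearson([d[p] for p in pairs], [coph[p] for p in pairs])


# Toy points (illustrative only): two tight pairs and one distant point.
points = [(0, 0), (0, 1), (5, 0), (5, 1), (10, 10)]
c_avg = cophenetic_correlation(points, "average")
print(c_avg)
```

A coefficient close to 1 means the dendrogram preserves the original pairwise distances well, so the linkage method with the highest value is the one such a comparison would favor.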
