An Empirical Comparison and Verification Study on the Containerports Clustering Measurement Using K-Means and Hierarchical Clustering (Average Linkage Method Using Cross-Efficiency Metrics, and Ward Method) and Mixed Models

According to Central Java Regional Police data, traffic accidents increased from 17,522 in 2017 to 19,016 in 2018, a rise of 8.54 percent. As a first step toward reducing traffic accidents in Central Java, districts/cities with similar accident-level characteristics by vehicle type were grouped using cluster analysis. The Ward and average linkage methods are hierarchical cluster analysis methods. The Ward method can maximize cluster homogeneity, while the average linkage method can generate clusters with small within-cluster variance. This study uses squared Euclidean distance to measure the similarity between pairs of objects. To assess the quality of the clustering results, the Dunn index and the cophenetic correlation coefficient are used for validation. Based on the clustering results, the optimal number of clusters is q = 5, obtained with the average linkage method, with a Dunn index of 0.08571196 and r_coph = 0.687458.

Keywords: Accidents, Cluster Analysis, Ward Method, Average Linkage, Squared Euclidean Distance, Dunn Index, Cophenetic Correlation Coefficient
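
The validation step described above can be sketched in Python with SciPy. This is a minimal illustration under stated assumptions, not the study's procedure: the data are synthetic stand-ins (three artificial groups, not the Central Java accident counts), the helper `dunn_index` is a hypothetical name, and the Dunn index is taken here as the smallest between-cluster distance divided by the largest within-cluster diameter.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Synthetic stand-in data (an assumption, not the accident data):
# three well-separated groups of 10 objects with 2 variables each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(10, 2)) for c in centers])

# Squared Euclidean distances in condensed form, then as a square matrix.
Y = pdist(X, metric="sqeuclidean")
D = squareform(Y)

# Average linkage on the precomputed squared Euclidean distances.
Z = linkage(Y, method="average")


def dunn_index(D, labels):
    """Smallest between-cluster distance over largest cluster diameter."""
    ids = np.unique(labels)
    max_diameter = max(D[np.ix_(labels == a, labels == a)].max() for a in ids)
    min_separation = min(
        D[np.ix_(labels == a, labels == b)].min()
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    )
    return min_separation / max_diameter


# Cut the tree at q = 2..6 clusters and keep the Dunn index for each cut.
dunn_by_q = {}
for q in range(2, 7):
    labels = fcluster(Z, t=q, criterion="maxclust")
    dunn_by_q[q] = dunn_index(D, labels)

best_q = max(dunn_by_q, key=dunn_by_q.get)
print(best_q, dunn_by_q[best_q])
```

A larger Dunn index indicates compact, well-separated clusters, so the candidate q with the largest value would be preferred under this criterion.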

Comparison of Single Linkage, Complete Linkage, and Average Linkage Methods on Community Welfare Analysis in Cities and Regencies in East Java

Jurnal Matematika Statistika dan Komputasi ◽ 10.20956/j.v18i1.14228 ◽ 2021 ◽ Vol 18 (1) ◽ pp. 130-140

Author(s): Yanuwar Reinaldi ◽ Nurissaidah Ulinnuha ◽ Moh. Hafiyusholeh

Keyword(s): Hierarchical Clustering ◽ National Development ◽ Clustering Methods ◽ Single Linkage ◽ Complete Linkage ◽ Average Linkage ◽ Linkage Methods ◽ Silhouette Index ◽ Linkage Method ◽ Index Value

Community welfare is an important concern for any region and is the essence of national development. Welfare in Indonesia is fairly unequal, especially in East Java. One way to map the regions of East Java by the welfare of their people is clustering. Hierarchical clustering is one family of methods for grouping data; within it, single linkage, complete linkage, and average linkage are all suitable, and this study compares them to find the best one. The results show that average linkage with three clusters performs best, with a silhouette index of 0.6054: the first cluster contains 23 regions (cities/districts with the highest community welfare), the second contains 11 regions (moderate welfare), and the third contains 4 regions (lowest welfare).
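
A comparison of this kind can be sketched as follows. The data are synthetic (38 artificial points in groups of 23, 11, and 4, chosen only to mirror the cluster sizes reported above, not the East Java welfare indicators), and `silhouette_index` is a hypothetical helper that computes the mean silhouette width by hand from a distance matrix.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Synthetic stand-in for 38 regions (an assumption, not the welfare data).
rng = np.random.default_rng(1)
groups = [(23, (0.0, 0.0)), (11, (10.0, 0.0)), (4, (5.0, 10.0))]
X = np.vstack(
    [np.array(c) + rng.normal(scale=0.5, size=(n, 2)) for n, c in groups]
)

Y = pdist(X)        # condensed Euclidean distances
D = squareform(Y)   # square distance matrix


def silhouette_index(D, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    widths = []
    for i in range(len(labels)):
        same = labels == labels[i]
        if same.sum() == 1:
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = D[i, same].sum() / (same.sum() - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            D[i, labels == c].mean()
            for c in np.unique(labels)
            if c != labels[i]
        )
        widths.append((b - a) / max(a, b))
    return float(np.mean(widths))


# Cut each tree into three clusters and score the result.
scores = {}
for method in ("single", "complete", "average"):
    labels = fcluster(linkage(Y, method=method), t=3, criterion="maxclust")
    scores[method] = silhouette_index(D, labels)

best_method = max(scores, key=scores.get)
print(scores, best_method)
```

On cleanly separated synthetic data the three methods may produce identical partitions and therefore tie; differences like those reported in the abstract appear only on less clear-cut data.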

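Comparison of Single Linkage, Complete Linkage, and Average Linkage Methods for Clustering Sub-districts Based on Livestock Type Variables in Sidoarjo Regency (Perbandingan Metode Single Linkage, Complete Linkage Dan Average Linkage dalam Pengelompokan Kecamatan Berdasarkan Variabel Jenis Ternak Kabupaten Sidoarjo)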

Jurnal INFORM ◽ 10.25139/inform.v4i2.1696 ◽ 2019 ◽ Vol 4 (2)

Author(s): Sulthan Fikri Mu'afa ◽ Nurissaidah Ulinnuha

Keyword(s): Farm Animals ◽ Single Linkage ◽ Daily Lives ◽ Food Ingredients ◽ Complete Linkage ◽ Material Sources ◽ Average Linkage ◽ Linkage Method ◽ Livestock Products ◽ Labor Resources

Livestock products are widely used in daily life, for example as food ingredients, industrial raw materials, labor resources, fertilizer sources, and energy sources. This study clusters livestock potential using 2017 livestock population data for Sidoarjo Regency with the single linkage, complete linkage, and average linkage methods, and compares the methods' performance. The data are grouped into three clusters. With single linkage and average linkage, sixteen sub-districts fall in the first cluster (low livestock potential) and one sub-district falls in each of the second and third clusters. With complete linkage, fifteen sub-districts fall in the first cluster (high livestock potential), two in the second cluster (medium potential), and one in the third cluster (high potential). Comparing standard deviation ratios, complete linkage obtains the smallest value, 0.222, which indicates that it performs better than single linkage and average linkage for grouping sub-districts by livestock type in Sidoarjo Regency.

Ward method of hierarchical clustering for non-Euclidean similarity measures

2015 7th International Conference of Soft Computing and Pattern Recognition (SoCPaR) ◽ 10.1109/socpar.2015.7492784 ◽ 2015 ◽ Cited By ~ 4

Author(s): Sadaaki Miyamoto ◽ Ryosuke Abe ◽ Yasunori Endo ◽ Jun-ichi Takeshita

Keyword(s): Hierarchical Clustering ◽ Similarity Measures ◽ Ward Method

Disease Interactome

Epidemiological Research Applications for Public Health Measurement and Intervention - Advances in Human Services and Public Health ◽ 10.4018/978-1-7998-4414-3.ch005 ◽ 2021 ◽ pp. 69-84

Author(s): Suma Dawn ◽ Nidhi Jain ◽ Tulika Gangwar

Keyword(s): Quantitative Analysis ◽ Hierarchical Clustering ◽ Major Part ◽ Visual Representation ◽ Clustering Algorithms ◽ Average Linkage ◽ Co Morbidity ◽ Gene Similarity ◽ The Relationship ◽ Secondary Diseases

The disease interactome is a network of genes that are related to each other through shared attributes. Because these genes are involved in multiple diseases, they reveal strong correlations among many diseases. As a major part of the interactome, genes can therefore be used to determine the relationships between diseases, their symptoms, clinical similarity, and co-morbidity. Subgraphs and similarity measures such as Jaccard distance, cosine similarity, and others have been used to quantify the relationship between two or more diseases. Many diseases that showed little resemblance in terms of gene similarity or symptom similarity were found to be closely related according to the network interactome. A quantitative disease-disease analysis was also performed. Hierarchical clustering with single, complete, and average linkage was applied to obtain a visual representation in the form of a dendrogram. In this way, a disease-disease interactome was created and analyzed to find related secondary diseases and to understand their basic nature.
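
The two similarity measures named above can be illustrated on gene sets. This is a minimal sketch: the disease and gene identifiers are placeholders invented for the example (not real associations), and cosine similarity is computed on the binary membership vectors implied by the sets.

```python
import math

# Placeholder disease-to-gene sets (hypothetical identifiers, not real data).
disease_genes = {
    "disease_A": {"G1", "G2", "G3"},
    "disease_B": {"G2", "G3", "G4"},
    "disease_C": {"G5"},
}


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; defined as 0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cosine_similarity(a, b):
    """|A ∩ B| / sqrt(|A| * |B|) for binary membership vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


genes_a = disease_genes["disease_A"]
genes_b = disease_genes["disease_B"]
genes_c = disease_genes["disease_C"]

print(jaccard_distance(genes_a, genes_b))   # 2 shared of 4 total -> 0.5
print(cosine_similarity(genes_a, genes_b))  # 2 / sqrt(3 * 3)
print(jaccard_distance(genes_a, genes_c))   # no shared genes -> 1.0
```

A matrix of such pairwise distances between diseases is the kind of input a hierarchical clustering routine would take to build a dendrogram.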

An Empirical Comparison of Baseline Models for Goodness-of-Fit in r-Diameter Hierarchical Clustering

Classification and Clustering ◽ 10.1016/b978-0-12-714250-0.50010-3 ◽ 1977 ◽ pp. 131-153 ◽ Cited By ~ 11

Author(s): Lawrence J. Hubert ◽ Frank B. Baker

Keyword(s): Hierarchical Clustering ◽ Goodness Of Fit ◽ Empirical Comparison

Empirical Comparison of Distances for Agglomerative Hierarchical Clustering

Communications in Computer and Information Science - Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations ◽ 10.1007/978-3-319-91476-3_45 ◽ 2018 ◽ pp. 538-548

Author(s): Shusaku Tsumoto ◽ Tomohiro Kimura ◽ Haruko Iwata ◽ Shoji Hirano

Keyword(s): Hierarchical Clustering ◽ Agglomerative Hierarchical Clustering ◽ Empirical Comparison

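Analysis of Clustering Results for Non-Fire Incident Areas Using Agglomerative Hierarchical Clustering in Semarang (Analisa Hasil Pengelompokan Wilayah Kejadian Non-Kebakaran Menggunakan Agglomerative Hierachical Clustering di Semarang)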

Jurnal Tekno Kompak ◽ 10.33365/jtk.v15i2.1166 ◽ 2021 ◽ Vol 15 (2) ◽ pp. 63

Author(s): Desy Exasanti ◽ Arief Jananto

Keyword(s): Hierarchical Clustering ◽ Manhattan Distance ◽ Agglomerative Hierarchical Clustering ◽ Single Linkage ◽ Bottom Up ◽ Environment Analysis ◽ Complete Linkage ◽ Average Linkage

Abstract: Clustering is a method of grouping data whose class labels are already known in order to find new clusters from the observations. Clustering methods include centroid-based, hierarchical, density-based, and grid-based approaches; this study uses a hierarchical method. A hierarchical method groups objects by building a hierarchy of clusters, although this does not necessarily correspond to a hierarchy in the organizational sense. Agglomerative Hierarchical Clustering, a bottom-up approach, was chosen: each object is initially treated as its own cluster, and successive iterations merge them into progressively larger clusters. The data are non-fire incident records from the Semarang City Fire Department (Dinas Pemadam Kebakaran Kota Semarang), and they are used to group the areas where non-fire incidents are handled. The Fire Department handles more than fires; non-fire incidents include reptile evacuation, cat evacuation, rescue of accident victims, and so on. The 2019 non-fire data from the 16 sub-districts of Semarang City are tested with three algorithms: single linkage, average linkage, and complete linkage. Single linkage merges clusters based on the smallest distance between data objects, average linkage on the average distance between them, and complete linkage on the largest distance. Implementation and visualization of the test data use WEKA 3.8.4; WEKA (Waikato Environment for Knowledge Analysis) is software written in the Java programming language.
From a dataset of 380 records, a sample of 100 was tested in WEKA using Manhattan distance with 3 clusters. The results can be visualized as a dendrogram with the visualize tree feature, and as a plot with the visualize cluster assignments feature.
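
The three merging rules described above can be made concrete with a small worked example. The two clusters below are made-up points (not the Semarang data), and the study itself used WEKA rather than Python; the sketch only shows how the single, average, and complete linkage distances between two clusters follow from the pairwise Manhattan distances.

```python
import numpy as np

# Two made-up clusters of 2-D points (hypothetical, for illustration only).
cluster_a = np.array([[0, 0], [1, 0]])
cluster_b = np.array([[3, 0], [4, 1]])

# Manhattan (city-block) distance between every point in A and every point in B.
pairwise = np.abs(cluster_a[:, None, :] - cluster_b[None, :, :]).sum(axis=-1)

single_link = pairwise.min()     # smallest pairwise distance
average_link = pairwise.mean()   # mean of all pairwise distances
complete_link = pairwise.max()   # largest pairwise distance

print(single_link, average_link, complete_link)  # prints: 2 3.5 5
```

At each agglomeration step, the pair of clusters with the smallest value under the chosen rule is merged, which is why the three rules can produce different dendrograms from the same data.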

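Application of the Agglomerative Hierarchical Clustering Method to Classify Regencies/Cities in East Java Province Based on the Quality of Family Planning Services (Penerapan Metode Angglomerative Hierarchical Clustering untuk Klasifikasi Kabupaten/Kota di Propinsi Jawa Timur Berdasarkan Kualitas Pelayanan Keluarga Berencana)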

CAUCHY ◽ 10.18860/ca.v4i1.3172 ◽ 2015 ◽ Vol 4 (1) ◽ pp. 25

Author(s): Alfi Fadliana ◽ Fachrur Rozi

Keyword(s): Hierarchical Clustering ◽ Good Condition ◽ Cluster Solution ◽ Agglomerative Hierarchical Clustering ◽ Clustering Methods ◽ Cophenetic Correlation ◽ Service Personnel ◽ Medium Condition ◽ Average Linkage ◽ Hierarchical Clustering Methods

Agglomerative hierarchical clustering is a family of cluster analysis methods whose primary purpose is to group objects based on their characteristics; it begins with individual objects and proceeds until all objects are fused into a single cluster. Agglomerative hierarchical clustering methods include single linkage, complete linkage, average linkage, and Ward. This research compared the four methods in order to find the best cluster solution for classifying regencies/cities in East Java province based on the quality of "Keluarga Berencana" (KB, family planning) services. The results showed that, based on the cophenetic correlation coefficient, the best cluster solution is produced by the average linkage method. This method yielded four clusters with different characteristics. Cluster 1 has an "extremely bad condition" on the qualification of KB clinics and the competence of KB service personnel. Cluster 2 has a "good condition" on the qualification of KB clinics and a "bad condition" on the competence of KB service personnel. Cluster 3 has a "bad condition" on the qualification of KB clinics and a "medium condition" on the competence of KB service personnel. Cluster 4 has a "medium condition" on the qualification of KB clinics and a "good condition" on the competence of KB service personnel.
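
Selecting a method by cophenetic correlation can be sketched with SciPy as below. The data are synthetic (38 artificial points with two variables, standing in for regencies/cities and two service-quality scores), so the ranking printed here says nothing about the study's actual result.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist

# Synthetic stand-in data (an assumption, not the KB service data).
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
sizes = [10, 10, 9, 9]
X = np.vstack(
    [c + rng.normal(scale=0.6, size=(n, 2)) for c, n in zip(centers, sizes)]
)

# Original pairwise Euclidean distances, against which each tree is compared.
Y = pdist(X)

coph_corr = {}
for method in ("single", "complete", "average", "ward"):
    # linkage() on raw observations uses Euclidean distance by default.
    Z = linkage(X, method=method)
    # cophenet() returns the correlation and the cophenetic distances.
    c, _ = cophenet(Z, Y)
    coph_corr[method] = float(c)

best_method = max(coph_corr, key=coph_corr.get)
print(coph_corr, best_method)
```

The coefficient measures how faithfully the dendrogram's merge heights preserve the original pairwise distances, so values closer to 1 indicate a better-fitting tree.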

Numerical taxonomy and ecology of oligotrophic bacteria isolated from the estuarine environment

Canadian Journal of Microbiology ◽ 10.1139/m77-110 ◽ 1977 ◽ Vol 23 (6) ◽ pp. 733-750 ◽ Cited By ~ 32

Author(s): L. M. Mallory ◽ B. Austin ◽ R. R. Colwell

Keyword(s): Estuarine Environment ◽ Nutrient Media ◽ Taxonomic Analysis ◽ Oligotrophic Bacteria ◽ Water And Sediment ◽ Average Linkage ◽ Linkage Method ◽ Slow Growing ◽ Physiological Characters ◽ Similarity Matrices

Slow-growing bacteria, isolated on nutrient-rich and nutrient-limited media from Chesapeake Bay water and sediment samples, were examined for 119 biochemical, cultural, morphological, nutritional, and physiological characters. Those bacteria which grow on low-nutrient media, termed oligotrophs, a total of 162 strains, were subjected to taxonomic analysis as a preliminary step in determining their ecological significance. The data for all strains included in the study were examined by computer, and the simple matching (S_SM) and Jaccard (S_J) coefficients were calculated. Clustering was achieved by the unweighted average-linkage method. From sorted similarity matrices and dendrograms, 148 strains, 90% of the total, were recovered in 24 phenetic groups defined at the 80 to 85% similarity level. Only 12 phena could be presumptively identified, and these included representatives of Alcaligenes, Corynebacterium, Hyphomicrobium, Hyphomonas polymorpha, Listeria, Nocardia marina, Pedomicrobium, Planococcus citreus, Sphaerotilus, Streptothrix, and Streptomyces. Of the remaining organisms, 10% were unidentified sheathed bacteria. It is concluded that slow-growing bacteria are distributed throughout the estuarine environment and can account for a large proportion of the colonies observed on media after prolonged periods of incubation. The oligotrophic bacteria appear to predominate in areas where the concentration of available nutrients is low and are more characteristic of non-eutrophic aquatic systems.
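
The numerical-taxonomy workflow above (binary characters, similarity coefficients, unweighted average linkage, groups cut at a similarity level) can be sketched with SciPy. The four "strains" and their ten characters are invented for illustration. The sketch relies on two standard identities: for binary data the Hamming distance equals 1 minus the simple matching coefficient, and SciPy's `jaccard` distance equals 1 minus the Jaccard coefficient.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Invented presence/absence data: 4 strains x 10 characters (not the study's).
strains = np.array(
    [
        [1, 1, 1, 1, 0, 0, 0, 0, 1, 0],
        [1, 1, 1, 1, 0, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 1, 1, 1, 1, 0, 1],
    ],
    dtype=bool,
)

# Dissimilarities: Hamming = 1 - simple matching, Jaccard distance = 1 - S_J.
d_sm = pdist(strains, metric="hamming")
d_j = pdist(strains, metric="jaccard")

# Similarity matrices (diagonal is 1 because squareform puts 0 there).
S_SM = 1.0 - squareform(d_sm)
S_J = 1.0 - squareform(d_j)

# Unweighted average linkage (UPGMA) on the simple matching dissimilarities.
Z = linkage(d_sm, method="average")

# Groups defined at the 80% similarity level, i.e. dissimilarity <= 0.20.
phena = fcluster(Z, t=0.20, criterion="distance")

print(S_SM[0, 1], S_J[0, 1], phena)
```

The simple matching coefficient counts shared absences as agreement while the Jaccard coefficient ignores them, which is why the two values for the same pair of strains differ.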
