Comparison of Different Distance Measures for Cluster Analysis of Tree-Ring Series

2008 ◽  
Vol 64 (1) ◽  
pp. 27-37 ◽  
Author(s):  
Ignacio García-González
Author(s):  
Michael C. Thrun

Although distance measures are used in many machine learning algorithms, the literature on the context-independent selection and evaluation of distance measures is limited, because existing approaches rely on prior knowledge. In cluster analysis, current studies evaluate the choice of distance measure after applying unsupervised methods based on error probabilities, implicitly setting the goal of reproducing predefined partitions in data. Such studies use clusters of data that are often based on the context of the data as well as the custom goal of the specific study. Depending on the data context, different properties of distance distributions are judged to be relevant for appropriate distance selection. However, if cluster analysis is based on the task of finding similar partitions of data, then the intrapartition distances should be smaller than the interpartition distances. By systematically investigating this specification using distribution analysis through the mirrored-density plot (MD plot), it is shown that multimodal distance distributions are preferable in cluster analysis. As a consequence, it is advantageous to model distance distributions with Gaussian mixtures prior to the evaluation phase of unsupervised methods. Experiments are performed on several artificial and natural datasets for the task of clustering.
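The requirement that intrapartition distances be smaller than interpartition distances can be checked directly on a labeled toy dataset. The following is a minimal sketch, not the method of the paper: the data, the labels, the Euclidean distance, and the helper name `split_distances` are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist
from statistics import mean


def split_distances(points, labels):
    """Split all pairwise Euclidean distances into intra- and interpartition lists."""
    intra, inter = [], []
    for i, j in combinations(range(len(points)), 2):
        d = dist(points[i], points[j])
        # A pair is "intra" when both points carry the same partition label.
        (intra if labels[i] == labels[j] else inter).append(d)
    return intra, inter


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = [0, 0, 0, 1, 1, 1]

intra, inter = split_distances(points, labels)
print("mean intrapartition distance:", mean(intra))
print("mean interpartition distance:", mean(inter))
print("fully separated:", max(intra) < min(inter))
```

When the two lists barely overlap, as here, the pooled distance distribution has two modes, which is the kind of multimodality the abstract describes as preferable.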


HortScience ◽  
2005 ◽  
Vol 40 (4) ◽  
pp. 1122B-1122 ◽  
Author(s):  
Peter Boches ◽  
Lisa J. Rowland ◽  
Kim Hummer ◽  
Nahla V. Bassil

Microsatellite markers for blueberry (Vaccinium L.) were created from a preexisting blueberry expressed sequence tag (EST) library of 1305 sequences and a microsatellite-enriched genomic library of 136 clones. Microsatellite primers for 65 EST-containing simple sequence repeats (SSRs) and 29 genomic SSRs were initially tested for amplification and polymorphism on agarose gels. Potential usefulness of these SSRs for estimating species relationships in the genus was assessed through cross-species transference of 45 SSR loci and cluster analysis using genetic distance values from five highly polymorphic EST-SSR loci. Cross-species amplification for 45 SSR loci ranged from 17% to 100%, and was 83% on average in nine sections. Cluster analysis of 59 Vaccinium species based on genetic distance measures obtained from five EST-SSR loci supported the concept of V. elliottii Chapm. as a genetically distinct diploid highbush species and indicated that V. ashei Reade is of hybrid origin. Twenty EST-SSR and 10 genomic microsatellite loci were used to determine genetic diversity in 72 tetraploid V. corymbosum L. accessions consisting mostly of common cultivars. Unique fingerprints were obtained for all accessions analyzed. Genetic relationships, based on microsatellites, corresponded well with known pedigree information. Most modern cultivars clustered closely together, but southern highbush and northern highbush cultivars were sufficiently differentiated to form distinct clusters. Future use of microsatellites in Vaccinium will help resolve species relationships in the genus, estimate genetic diversity in the National Clonal Germplasm Repository (NCGR) collection, and confirm the identity of clonal germplasm accessions.


Plants ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 276
Author(s):  
Mehmet Doğan ◽  
Nesibe Köse

In this study, we identified the most important climate factors affecting the radial growth of black pine at different elevations in the mountain regions of Southwestern Turkey (Sandıras Mountain, Muğla/Turkey). We used four black pine tree-ring chronologies, which represent the upper and lower distribution limits of black pine forest on the south and north slopes of Sandıras Mountain. The relationships between tree-ring width and climate were identified using response function analysis. We performed hierarchical cluster analysis to classify the response functions into meaningful groups. Black pine trees in the mountain regions of Southwestern Turkey responded positively to warmer temperatures and high precipitation at the beginning of the growing season. As high summer temperatures exacerbated drought, radial growth was affected negatively. Hierarchical cluster analysis made clear that elevation differences, rather than aspect, were the main factor responsible for the formation of the clusters. Due to the mountainous terrain of the study area, the changing climatic conditions (air temperature and precipitation) affected the tree-ring widths differently depending on elevation.


2021 ◽  
Vol 56 (3) ◽  
pp. 157-168
Author(s):  
Adji Achmad Rinaldo Fernandes ◽  
Solimun ◽  
Nurjannah ◽  
Usfi Al Imama Billah ◽  
Ni Made Ayu Astari Badung

This study compares integrated cluster analysis and structural equation modeling (SEM) with the Warp-PLS approach across several cluster validity indices and distance measures, applied to Service Quality, Environment, Fashion, Willingness to Pay, and Compliant Paying Behavior of Bank X customers. The data are primary and were collected through a Likert-scale questionnaire; each variable was measured as the average score of its items. Purposive sampling was used, yielding 100 customer respondents. The data were analyzed quantitatively, beginning with a descriptive analysis. The integrated cluster analysis and SEM with the Warp-PLS approach were then carried out using the average linkage method with four cluster validity indices and three distance measures. With the Gap index, C index, global Silhouette, and Goodman-Kruskal index, the integrated cluster and SEM model performed better under the Manhattan distance than under the Euclidean and Minkowski distances. The novelty of this research is the application of integrated cluster analysis and SEM with the Warp-PLS approach to compare four cluster validity indices (Gap index, C index, global Silhouette, and Goodman-Kruskal) and three distance measures (Euclidean, Manhattan, and Minkowski) simultaneously.
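The three distance measures named in this abstract belong to one family: the Manhattan and Euclidean distances are the Minkowski distance of order 1 and 2, respectively. The sketch below illustrates this on a hypothetical pair of score vectors. The abstract does not state which Minkowski order the study used, so the order 3 shown here is purely an assumption.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Hypothetical score vectors for two respondents (not data from the study).
a = (1.0, 2.0, 3.0)
b = (4.0, 6.0, 3.0)

manhattan = minkowski(a, b, 1)  # |3| + |4| + |0|
euclidean = minkowski(a, b, 2)  # sqrt(3^2 + 4^2 + 0^2)
order_3 = minkowski(a, b, 3)    # assumed order, for illustration only

print(manhattan, euclidean, order_3)
```

For a fixed pair of vectors the Minkowski distance does not increase as the order grows, so the Manhattan distance is always the largest of the three. Because the orders weight large coordinate differences differently, they can rank pairs of objects differently, which is why the choice of distance can change the cluster solution.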


2016 ◽  
Vol 14 (1) ◽  
pp. 117-126 ◽  
Author(s):  
Kgwadi M. Mampana ◽  
Solly M. Seeletse ◽  
Enoch M. Sithole

Some problems cannot be solved optimally and compromises become necessary. In some cases obtaining an optimal solution may require combining algorithms and iterations. This often occurs when the problem is complex and a single procedure does not reach optimality. This paper shows a conglomerate of algorithms iterated in tasks to form an optimal consortium using cluster analysis. Hierarchical methods and distance measures lead the process. Few companies are desirable in optimal consortium formation. However, this study shows that optimization cannot be predetermined based on a specific fixed number of companies. The experiential exercise forms an optimal consortium of four companies from six shortlisted competitors.


2017 ◽  
Vol 12 (1) ◽  
pp. 014007 ◽  
Author(s):  
Tyler J Tran ◽  
Jamis M Bruening ◽  
Andrew G Bunn ◽  
Matthew W Salzer ◽  
Stuart B Weiss

2015 ◽  
Vol 45 (3) ◽  
pp. 343-352 ◽  
Author(s):  
A. Nicault ◽  
E. Boucher ◽  
D. Tapsoba ◽  
D. Arseneault ◽  
F. Berninger ◽  
...  

The aim of this study is to analyze the relationships between black spruce (Picea mariana (Mill.) B.S.P.) growth and climate at a large spatial scale in North America’s northeastern boreal forest. The study area (approximately 700 000 km²) is located in the taiga zone of the Quebec – Labrador Peninsula. A network of tree-ring chronologies from 93 black spruce populations was developed. A hierarchical cluster analysis was conducted to analyze tree-ring series affinities, and response functions were calculated to analyze relationships between tree rings and climate. The cluster analysis results showed well-marked spatial affinities among the tree-ring series. These affinities were strongly linked with the spatial variability of the relationships between tree rings and climate. The interannual growth variations were governed mainly by the temperature variables that preceded the growing season (November (negative influence), December–January (positive influence), and April (positive influence)). The growing-season temperature (July temperature) mainly influenced the northernmost populations. Relationships between tree rings and climate in the northeastern boreal forest varied at a large spatial scale. This variability was expressed by a north–south contrast, which appears to be related to a temperature gradient, and an east–west contrast linked to a humidity gradient, which favors winter snow cover.
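Affinities among tree-ring series are often expressed through a correlation-based distance before hierarchical clustering is applied. The abstract does not say which distance the authors used, so the following is only a sketch under that assumption. The series values, the site names, and the choice of 1 minus the Pearson correlation are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Distance in [0, 2]: 0 for perfectly correlated series."""
    return 1.0 - pearson(x, y)


# Hypothetical ring-width indices for three sites over five years.
series = {
    "north_a": [1.0, 1.2, 0.8, 1.1, 0.9],
    "north_b": [0.9, 1.3, 0.7, 1.2, 0.8],
    "south_a": [1.1, 0.8, 1.2, 0.9, 1.0],
}

names = sorted(series)
pairs = {
    (a, b): correlation_distance(series[a], series[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}

# The first merge of an agglomerative clustering joins the closest pair.
closest = min(pairs, key=pairs.get)
print(closest, round(pairs[closest], 3))
```

Repeating the merge step with a linkage rule, such as average linkage, would yield the full hierarchy. Only the first merge is shown to keep the sketch short.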


2020 ◽  
Vol 17 (1) ◽  
Author(s):  
Jana Cibulková ◽  
Zdenek Šulc ◽  
Hana Řezanková ◽  
Sergej Sirota

The paper focuses on similarity and distance measures for binary data and their application in cluster analysis. It analyzes 66 measures for binary data in order to provide comprehensive insight into the topic and a well-arranged overview of the measures. For this purpose, the formulas that define them are studied. In the next part of the research, the results of object clustering on generated datasets are compared, and the ability of the measures to create similar or identical clustering solutions is evaluated. This is done by using chosen internal and external evaluation criteria and by comparing the assignments of objects to clusters in the process of hierarchical clustering. The paper shows which similarity and distance measures for binary data lead to similar or even identical results in hierarchical cluster analysis.
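Binary similarity measures are typically defined from the four counts of a 2×2 contingency table for two objects. As an illustration, the sketch below computes two widely used measures, the Jaccard coefficient and the simple matching coefficient. Whether these two are among the 66 measures studied is not stated in the abstract, and the example vectors are hypothetical.

```python
def contingency(x, y):
    """Counts a (1,1), b (1,0), c (0,1), d (0,0) for two binary vectors."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


def jaccard(x, y):
    """Jaccard similarity: ignores joint absences (d). Undefined if a+b+c == 0."""
    a, b, c, _ = contingency(x, y)
    return a / (a + b + c)


def simple_matching(x, y):
    """Simple matching coefficient: counts joint absences as agreement."""
    a, b, c, d = contingency(x, y)
    return (a + d) / (a + b + c + d)


# Hypothetical binary objects described by eight attributes.
x = [1, 1, 0, 0, 1, 0, 0, 0]
y = [1, 0, 0, 0, 1, 1, 0, 0]

print(jaccard(x, y), simple_matching(x, y))
```

The two measures differ only in how they treat joint absences, so they can disagree on sparse data. A distance is commonly obtained as 1 minus the similarity.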

