Validation of cluster analysis results on validation data: A systematic framework

Author(s):  
Theresa Ullmann ◽  
Christian Hennig ◽  
Anne‐Laure Boulesteix
2017 ◽  
Vol 13 (2) ◽  
pp. 1-12 ◽  
Author(s):  
Jungmok Ma

One of the major obstacles in applying the k-means clustering algorithm is the selection of the number of clusters k. A multi-attribute utility theory (MAUT)-based k-means clustering algorithm is proposed to tackle this problem by incorporating user preferences. Using MAUT, the decision maker's value structure for the number of clusters and other attributes can be modeled quantitatively and used as the objective function of k-means. A target clustering problem from the military targeting process is used to demonstrate the MAUT-based k-means and to provide a comparative study. The results show that existing clustering algorithms do not necessarily reflect user preferences, while the MAUT-based k-means provides a systematic framework for modeling preferences in cluster analysis.
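As a rough illustration of the idea in this abstract, the sketch below scores each candidate k with an additive two-attribute utility and keeps the best one. It is not the author's algorithm: the abstract uses the utility as the k-means objective itself, whereas this sketch only uses it to choose among ordinary k-means runs. The utility shapes, the weights `w_k` and `w_fit`, the farthest-first initialization, and the toy `POINTS` data are all assumptions made for the example.

```python
def dist2(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic initialization: start at points[0], then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    return centers


def kmeans_sse(points, k, iters=100):
    """Plain Lloyd's k-means; returns the within-cluster sum of squares."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(dist2(p, c) for c in centers) for p in points)


def best_k(points, k_values, w_k, w_fit):
    """Pick the k with the highest additive utility (hypothetical form):
    u = w_k * u_k(k) + w_fit * u_fit(k), where u_k prefers fewer clusters
    and u_fit rewards a lower within-cluster sum of squares."""
    k_values = list(k_values)
    k_min, k_max = min(k_values), max(k_values)
    baseline = kmeans_sse(points, 1)  # worst-case fit: one cluster

    def utility(k):
        u_k = 1.0 if k_max == k_min else (k_max - k) / (k_max - k_min)
        u_fit = 1.0 - kmeans_sse(points, k) / baseline if baseline > 0 else 1.0
        return w_k * u_k + w_fit * u_fit

    return max(k_values, key=utility)


# Toy data (assumed): three well-separated groups of four points each.
POINTS = [
    (cx + dx, cy + dy)
    for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    for dx in (-0.5, 0.5)
    for dy in (-0.5, 0.5)
]

if __name__ == "__main__":
    print(best_k(POINTS, range(1, 7), w_k=0.3, w_fit=0.7))
    print(best_k(POINTS, range(1, 7), w_k=0.9, w_fit=0.1))
```

Changing the weights changes the selected k on the same data, which is the sense in which the clustering reflects the decision maker's preferences rather than fit alone.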


2020 ◽  
Author(s):  
Zhifeng Wu ◽  
Qifei Zhang ◽  
Yinbiao Chen ◽  
Paolo Tarolli

Under the combined effects of climate change and rapid urbanization, low-lying coastal cities are vulnerable to urban waterlogging events. Urban waterlogging refers to the accumulation of rainwater that cannot be discharged through the drainage system in time, and it is affected by both natural conditions and human activities. Because of the spatial heterogeneity of the urban landscape and the non-linear interaction between influencing factors, in this work we propose a novel approach to characterize urban waterlogging variation in highly urbanized areas by implementing a watershed-based Stepwise Cluster Analysis Model (SCAM), which considers both natural and anthropogenic variables (i.e. topographic factors, cumulated precipitation, land surface characteristics, drainage density, and GDP). The watershed-based stepwise cluster analysis model is based on the theory of multivariate analysis of variance and can effectively capture the non-stationary and complex relationship between urban waterlogging and natural and anthropogenic factors. The watershed-based analysis overcomes the shortcomings of the negative sample selection method employed in previous studies, which greatly improves model reliability and accuracy. Furthermore, different land-use scenarios (the proportion of impervious surfaces remaining unchanged, or increasing by 5% or 10%) and rainfall scenarios (accumulated precipitation increasing by 5%, 10%, 20%, or 50%) are adopted to simulate the variation in waterlogging density and thus to identify future waterlogging-prone areas. As a case study, we consider waterlogging events from 2009 to 2015 in the central urban districts of Guangzhou, China, a highly urbanized coastal city.
The results demonstrate that: (1) the SCAM shows strong fitting and prediction performance on both the calibration and validation data sets, illustrating that it can be used to reveal the complex mechanisms linking urban waterlogging to natural and anthropogenic factors; (2) the SCAM provides more accurate and detailed simulated results than other machine learning models (LR, ANN, SVM), reflecting the occurrence and distribution of urban waterlogging events more realistically; (3) under different urbanization and precipitation scenarios, urban waterlogging density and waterlogging-prone areas vary greatly, so strategies should be developed to cope with different future scenarios. Although heavy precipitation can increase the occurrence of urban waterlogging, urban expansion, characterized by an increase in impervious surface abundance, was the dominant cause of urban waterlogging in the study area. This study extends our scientific understanding and provides theoretical and practical references for developing waterlogging management strategies and for promoting the further application of the stepwise cluster analysis model in the assessment and simulation of urban waterlogging variation.


Author(s):  
Thomas W. Shattuck ◽  
James R. Anderson ◽  
Neil W. Tindale ◽  
Peter R. Buseck

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive x-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multivariate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific Ocean in the summer of 1990. The mid-Pacific aerosol provides information on long range particle transport, iron deposition, sea salt ageing, and halogen chemistry. Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero, because of finite detection limits. Many of the clusters show considerable overlap, because of natural variability, agglomeration, and chemical reactivity.


Author(s):  
Matthew L. Hall ◽  
Stephanie De Anda

Purpose The purposes of this study were (a) to introduce “language access profiles” as a viable alternative construct to “communication mode” for describing experience with language input during early childhood for deaf and hard-of-hearing (DHH) children; (b) to describe the development of a new tool for measuring DHH children's language access profiles during infancy and toddlerhood; and (c) to evaluate the novelty, reliability, and validity of this tool. Method We adapted an existing retrospective parent report measure of early language experience (the Language Exposure Assessment Tool) to make it suitable for use with DHH populations. We administered the adapted instrument (DHH Language Exposure Assessment Tool [D-LEAT]) to the caregivers of 105 DHH children aged 12 years and younger. To measure convergent validity, we also administered another novel instrument: the Language Access Profile Tool. To measure test–retest reliability, half of the participants were interviewed again after 1 month. We identified groups of children with similar language access profiles by using hierarchical cluster analysis. Results The D-LEAT revealed DHH children's diverse experiences with access to language during infancy and toddlerhood. Cluster analysis groupings were markedly different from those derived from more traditional grouping rules (e.g., communication modes). Test–retest reliability was good, especially for the same-interviewer condition. Content, convergent, and face validity were strong. Conclusions To optimize DHH children's developmental potential, stakeholders who work at the individual and population levels would benefit from replacing communication mode with language access profiles. The D-LEAT is the first tool that aims to measure this novel construct. Despite limitations that future work aims to address, the present results demonstrate that the D-LEAT represents progress over the status quo.
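The grouping step mentioned in this abstract, hierarchical cluster analysis of language access profiles, can be sketched roughly as follows. This is a minimal illustration only: the abstract does not state the linkage rule or distance measure, so average linkage with Euclidean distance is an assumption, and the `PROFILES` rows (shares of spoken input, signed input, and no accessible input) are invented for the example and are not D-LEAT data.

```python
def average_linkage(profiles, n_groups):
    """Agglomerative clustering: start with one group per child and
    repeatedly merge the two groups with the smallest average pairwise
    Euclidean distance until n_groups remain. Returns lists of row indices."""
    groups = [[i] for i in range(len(profiles))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(profiles[a], profiles[b])) ** 0.5

    def group_dist(g, h):
        return sum(dist(a, b) for a in g for b in h) / (len(g) * len(h))

    while len(groups) > n_groups:
        # Find the closest pair of groups (i < j).
        i, j = min(
            ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
            key=lambda ij: group_dist(groups[ij[0]], groups[ij[1]]),
        )
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return [sorted(g) for g in groups]


# Hypothetical profiles: (spoken input, signed input, no accessible input).
PROFILES = [
    (0.90, 0.10, 0.00),
    (0.85, 0.10, 0.05),
    (0.10, 0.90, 0.00),
    (0.15, 0.80, 0.05),
    (0.20, 0.10, 0.70),
    (0.10, 0.20, 0.70),
]

if __name__ == "__main__":
    print(average_linkage(PROFILES, 3))
```

Groups formed this way depend only on the similarity of the measured profiles, which is why they can differ from groupings based on a single categorical label such as communication mode.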


2001 ◽  
Vol 60 (2) ◽  
pp. 89-98 ◽  
Author(s):  
Alain Clémence ◽  
Thierry Devos ◽  
Willem Doise

Social representations of human rights violations were investigated in a questionnaire study conducted in five countries (Costa Rica, France, Italy, Romania, and Switzerland) (N = 1239 young people). We were able to show that respondents organize their understanding of human rights violations in similar ways across nations. At the same time, systematic variations characterized opinions about human rights violations, and the structure of these variations was similar across national contexts. Differences in definitions of human rights violations were identified by a cluster analysis. A broader definition was related to critical attitudes toward governmental and institutional abuses of power, whereas a more restricted definition was rooted in a fatalistic conception of social reality, approval of social regulations, and greater tolerance for institutional infringements of privacy. An atypical definition was anchored either in a strong rejection of social regulations or in a strong condemnation of immoral individual actions linked with a high tolerance for governmental interference. These findings support the idea that contrasting definitions of human rights coexist and that these definitions are underpinned by a set of beliefs regarding the relationships between individuals and institutions.


2016 ◽  
Vol 37 (4) ◽  
pp. 250-259 ◽  
Author(s):  
Cara A. Palmer ◽  
Meagan A. Ramsey ◽  
Jennifer N. Morey ◽  
Amy L. Gentzler

Abstract. Research suggests that sharing positive events with others is beneficial for well-being, yet little is known about how positive events are shared with others and who is most likely to share their positive events. The current study expanded on previous research by investigating how positive events are shared and individual differences in how people share these events. Participants (N = 251) reported on their likelihood to share positive events in three ways: capitalizing (sharing with close others), bragging (sharing with someone who may become jealous or upset), and mass-sharing (sharing with many people at once using communication technology) across a range of positive scenarios. Using cluster analysis, five meaningful profiles of sharing patterns emerged. These profiles were associated with gender, Big Five personality traits, narcissism, and empathy. Individuals who tended to brag when they shared their positive events were more likely to be men, reported less agreeableness, less conscientiousness, and less empathy, whereas those who tended to brag and mass-share reported the highest levels of narcissism. These results have important theoretical and practical implications for the growing body of research on sharing positive events.


2019 ◽  
Vol 13 (2) ◽  
pp. 144-152 ◽  
Author(s):  
Roni Reiter-Palmon ◽  
Boris Forthmann ◽  
Baptiste Barbot
