Using hierarchical clustering to explore patterns of deprivation among English local authorities

2019 ◽  
Vol 42 (4) ◽  
pp. 772-777
Author(s):  
Steven L Senior

Background: The English Indices of Multiple Deprivation (IMD) is widely used as a measure of deprivation. However, similarly ranked areas can differ substantially in the underlying domains of deprivation. These domains contain a richer set of data that might be useful for classifying local authorities. Clustering methods offer a set of techniques to identify groups of areas with similar patterns of deprivation. Methods: Hierarchical agglomerative (i.e. bottom-up) clustering methods were applied to domain scores for 152 upper-tier local authorities. Advances in statistical testing allow clusters to be identified that are unlikely to have arisen from random partitioning of a homogeneous group. The resulting clusters are described in terms of their subdomain scores and basic geographic and demographic characteristics. Results: Five statistically significant clusters of local authorities were identified. These clusters only partially reflect different levels of overall deprivation. In particular, two clusters share similar overall IMD scores but have contrasting patterns of deprivation. Conclusion: Hierarchical clustering methods identify five distinct clusters that do not correspond closely to quintiles of deprivation. This approach may help to distinguish between places that face similar underlying challenges, and places that appear similar in terms of overall deprivation scores, but that face different challenges.
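The bottom-up procedure named in the Methods can be illustrated with a minimal sketch. It assumes average linkage and Euclidean distance, neither of which is stated in the abstract, and it uses made-up two-domain scores rather than real IMD data. The statistical significance testing of clusters is not covered here.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters):
    """Bottom-up clustering with average linkage; returns lists of point indices."""
    # Start with one singleton cluster per observation.
    clusters = [[i] for i in range(len(points))]

    def avg_linkage(a, b):
        # Mean pairwise Euclidean distance between members of clusters a and b.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (a < b) and merge b into a.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: avg_linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical scores on two deprivation domains for five areas (illustrative only).
scores = [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8), (0.85, 0.9), (0.5, 0.1)]
groups = agglomerate(scores, 3)
print(groups)
```

Stopping at a fixed number of clusters is a simplification; the abstract instead describes choosing clusters by testing whether a split is unlikely to arise from a homogeneous group.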

2019 ◽  
Author(s):  
Steven L. Senior

Background: The English Indices of Multiple Deprivation (IMD) is widely used as a measure of deprivation of geographic areas in analyses of health inequalities between places. However, similarly ranked areas can differ substantially in the underlying domains and indicators that are used to calculate the IMD score. These domains and indicators contain a richer set of data that might be useful for classifying local authorities. Clustering methods offer a set of techniques to identify groups of areas with similar patterns of deprivation. This could offer insights into areas that face similar challenges. Methods: Hierarchical agglomerative (i.e. bottom-up) clustering methods were applied to sub-domain scores for 152 upper-tier local authorities. Recent advances in statistical testing allow clusters to be identified that are unlikely to have arisen from random partitioning of a homogeneous group. The resulting clusters are described in terms of their subdomain scores and basic geographic and demographic characteristics. Results: Five statistically significant clusters of local authorities were identified. These clusters represented local authorities that were: most deprived, predominantly urban; least deprived, predominantly rural; less deprived, rural; deprived, high crime, high barriers to housing; and deprived, low education, poor employment, poor health. Conclusion: Hierarchical clustering methods identify five distinct clusters that do not correspond closely to quintiles of deprivation. These methods can be used to draw on the richer set of information contained in the IMD domains and may help to identify places that face similar challenges, and places that appear similar in terms of IMD scores, but that face different challenges.


2019 ◽  
Vol 488 (1) ◽  
pp. 1377-1386 ◽  
Author(s):  
V Carruba ◽  
S Aljbaae ◽  
A Lucchini

Asteroid families are groups of asteroids that share a common origin. They can be the outcome of a collision or the result of the rotational failure of a parent body or its satellites. Collisional asteroid families have been identified for several decades using hierarchical clustering methods (HCMs) in proper element domains. In this method, the distance of an asteroid from a reference body is computed, and, if it is less than a critical value, the asteroid is added to the family list. The process is then repeated with the new object as a reference, until no new family members are found. Recently, new machine-learning clustering algorithms have been introduced for the purpose of cluster classification. Here, we apply supervised-learning hierarchical clustering algorithms to asteroid family identification. The accuracy, precision, and recall values of the results obtained with the new method, when compared with classical HCM, show that this approach is able to find family members with an accuracy above 89.5 per cent, and that all asteroids previously identified as family members by traditional methods are consistently retrieved. Values of the area under the Receiver Operating Characteristic curve are also optimal, consistently above 85 per cent. Overall, we identify 6 new families and 13 new clumps in regions where the method can be applied; these appear to be consistent and homogeneous in terms of physical and taxonomic properties. Machine-learning clustering algorithms can, therefore, be very efficient and fast tools for the problem of asteroid family identification.
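The classical HCM loop described above (test distance to a reference body, add members below the critical value, repeat with each new member as reference) can be sketched as follows. This is an illustration under assumptions: plain Euclidean distance stands in for the proper-element metric used in practice, and the function name, coordinates, and cutoff are hypothetical.

```python
from collections import deque
from math import dist


def hcm_family(elements, seed, cutoff):
    """Return sorted indices of bodies linked to `seed` by chains of distances below `cutoff`."""
    family = {seed}
    queue = deque([seed])  # bodies still to be used as the reference
    while queue:
        ref = queue.popleft()
        for i, el in enumerate(elements):
            # Add any body not yet in the family that lies within the cutoff of the reference.
            if i not in family and dist(elements[ref], el) < cutoff:
                family.add(i)
                queue.append(i)  # the new member later serves as a reference itself
    return sorted(family)


# Hypothetical 2-D coordinates standing in for proper elements (illustrative only).
bodies = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (1.4, 0.1)]
members = hcm_family(bodies, seed=0, cutoff=0.6)
print(members)
```

Body 4 is farther than the cutoff from the seed but is still reached through the chain 0, 1, 2, which is the chaining behaviour the abstract describes; body 3 is never linked.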


2010 ◽  
Vol 439-440 ◽  
pp. 1306-1311
Author(s):  
Fang Li ◽  
Qun Xiong Zhu

An LSI-based hierarchical agglomerative clustering algorithm is studied. To address the problems of the LSI-based hierarchical agglomerative clustering method, an NMF-based hierarchical clustering method is proposed and analyzed. Two ways of implementing the NMF-based method are introduced. Finally, the results of two groups of experiments on the TanCorp document corpus show that the proposed method is effective.


2017 ◽  
Vol 14 (1) ◽  
Author(s):  
Zdeněk Šulc ◽  
Martin Matějka ◽  
Jiří Procházka ◽  
Hana Řezanková

This paper thoroughly examines three recently introduced modifications of the Gower coefficient, which were designed for data with mixed-type variables in hierarchical clustering. In contrast to the original Gower coefficient, which only recognizes whether two categories match in the case of nominal variables, the examined modifications offer three different approaches to measuring the similarity between categories. The examined dissimilarity measures are compared and evaluated regarding the quality of their clusters, measured by three internal indices (Dunn, silhouette, McClain), and regarding their classification abilities, measured by the Rand index. The comparison is performed on 810 generated datasets. In the analysis, the performance of the similarity measures is evaluated by different data characteristics (the number of variables, the number of categories, the distance of clusters, etc.) and by different hierarchical clustering methods (average, complete, McQuitty and single linkage methods). As a result, two modifications are recommended for use in practice.
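For orientation, the original Gower coefficient that the paper takes as its baseline can be sketched as below: nominal variables contribute 0 on a match and 1 otherwise, and numeric variables contribute the absolute difference scaled by the variable's range. The function name, the `ranges` convention, and the sample records are assumptions made for illustration; the three modifications examined in the paper are not reproduced here.

```python
def gower_dissimilarity(x, y, ranges):
    """Original Gower dissimilarity between two mixed-type records.

    ranges[k] is the range (max - min, assumed > 0) of numeric variable k,
    or None when variable k is nominal.
    """
    total = 0.0
    for xk, yk, rk in zip(x, y, ranges):
        if rk is None:
            # Nominal variable: simple match / mismatch.
            total += 0.0 if xk == yk else 1.0
        else:
            # Numeric variable: range-normalised absolute difference.
            total += abs(xk - yk) / rk
    return total / len(ranges)


# Hypothetical records: (age, colour, flag); age has an assumed range of 20.
a = (30, "red", "yes")
b = (40, "blue", "yes")
d = gower_dissimilarity(a, b, ranges=(20, None, None))
print(d)
```

A pairwise matrix of such dissimilarities is what a hierarchical clustering routine would then take as input.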


2017 ◽  
Vol 9 (2) ◽  
pp. 195-213
Author(s):  
Richárd Forster ◽  
Ágnes Fülöp

The reconstruction and analysis of measured data play an important role in high-energy particle physics research. This leads to new results in both experimental and theoretical physics, and requires algorithm improvements and high computing capacity. Clustering algorithms make it possible to determine the jet structure more accurately. More granular parallelization of the kt cluster algorithms was explored by combining it with the hierarchical clustering methods used in network evaluations. The kt method makes it possible to trace the development of particles produced in high-energy nucleus-nucleus collisions. The hierarchical clustering algorithms work on graphs, so the particle information used by the standard kt algorithm was first transformed into an appropriate graph representing the network of particles. Testing was done using data samples from the ALICE offline library, which contains the modules required to simulate the ALICE detector, a dedicated Pb-Pb detector. The proposed algorithm was compared to the FastJet toolkit's standard longitudinally invariant kt implementation. Parallelizing the standard non-optimized version of this algorithm on the available CPU architecture proved to be 1.6 times faster than the standard implementation, while the solution proposed in this paper achieved 12 times faster computing performance and is scalable enough to run efficiently on GPUs.
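As background to the longitudinally invariant kt algorithm mentioned above, the sketch below computes its two distance measures: the pairwise distance d_ij = min(kt_i^2, kt_j^2) * dR_ij^2 / R^2 and the beam distance d_iB = kt_i^2. It is not the paper's parallel or graph-based implementation; the tuple layout (kt, rapidity, azimuth) and the sample values are assumptions for illustration.

```python
from math import pi


def kt_distance(p, q, R=1.0):
    """Pairwise kt distance for particles given as (kt, y, phi) tuples."""
    dphi = abs(p[2] - q[2])
    if dphi > pi:
        # Azimuthal angles wrap around, so take the shorter way round.
        dphi = 2 * pi - dphi
    dr2 = (p[1] - q[1]) ** 2 + dphi ** 2
    return min(p[0], q[0]) ** 2 * dr2 / R ** 2


def beam_distance(p):
    """Distance of a particle to the beam: its squared transverse momentum."""
    return p[0] ** 2


# Hypothetical particles (kt, rapidity, azimuth), illustrative values only.
p = (2.0, 0.0, 0.0)
q = (1.0, 0.3, 0.4)
d_pq = kt_distance(p, q)
d_qB = beam_distance(q)
print(d_pq, d_qB)
```

In the sequential algorithm, the smallest of all d_ij and d_iB is found at each step: if it is a pairwise distance the two particles are merged, otherwise the particle is declared a jet. Here d_pq is smaller than d_qB, so p and q would be merged.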


2019 ◽  
Vol 15 (2) ◽  
pp. 41-52
Author(s):  
Francisco Kelsen de Oliveira ◽  
Max Brandão de Oliveira ◽  
Alex Sandro Gomes ◽  
Leandro Marques Queiros

This article presents data from a group of users divided into subgroups according to their level of knowledge about technology. Statistical hierarchical and non-hierarchical clustering methods were studied, compared, and used to create the subgroups based on similarities in the users' technology skill levels. The research sample consists of teachers who answered online questionnaires about their skills in using software and hardware for educational purposes. The clustering methods were applied and showed the possible groupings of the users. Analysis of these groups allowed the common characteristics of the individuals in each subgroup to be identified. It was therefore possible to define two subgroups of users, one with skills in technology and another without. The partial results of the research showed that two main grouping algorithms reached 92% similarity in separating users with technology skills from those with little skill, confirming the accuracy of the techniques in discriminating between individuals.

