hierarchical partitioning
Recently Published Documents


TOTAL DOCUMENTS: 70 (FIVE YEARS: 11)

H-INDEX: 16 (FIVE YEARS: 1)

2021 ◽  
Vol 20 ◽  
pp. 177-184
Author(s):  
Ozer Ozdemir ◽  
Simgenur Cerman

Clustering is one of the most commonly used techniques in data mining. It can be performed with different families of algorithms, such as hierarchical, partitioning, grid-based, density-based and graph-based algorithms. This study first explains the concept of data mining, then describes the aims and application areas of data mining, and then gives a theoretical account of clustering and the clustering algorithms used in data mining. Finally, within the scope of the study, partitional and hierarchical clustering algorithms are applied to the "Mall Customers" data set, taken from Kaggle, with the aim of separating customers into clusters according to their features. In the clusters obtained by partitional clustering algorithms, similarity within each cluster is maximal and similarity between clusters is minimal. Hierarchical clustering algorithms are based on progressively gathering similar items together or, conversely, splitting them apart. The partitional clustering algorithms used are k-means and PAM; the hierarchical clustering algorithms used are AGNES and DIANA. The algorithms were applied using the R statistical programming language. At the end of the study, the clustering algorithms were run on the data set and the results of the analysis were interpreted.
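
The contrast drawn in this abstract between partitional and hierarchical clustering can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and it is not the authors' code: the `customers` points are invented (income, spending score) pairs rather than the Kaggle data, `kmeans` seeds its centroids with the first k points, and `agnes` is a simple average-linkage stand-in for the AGNES algorithm.

```python
from math import dist
from statistics import fmean


def kmeans(points, k, iterations=100):
    """Partitional clustering (Lloyd's k-means) with a deterministic start."""
    centroids = list(points[:k])  # assumption: the first k points seed the centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(map(fmean, zip(*members))) if members else centroids[c])
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return labels, centroids


def agnes(points, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]  # every point starts alone

    def average_linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return fmean(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = [(a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        a, b = min(pairs, key=lambda ab: average_linkage(clusters[ab[0]], clusters[ab[1]]))
        merged = clusters.pop(b)  # b > a, so index a stays valid after the pop
        clusters[a].extend(merged)
    return clusters


# Hypothetical (annual income, spending score) pairs, not the real data set.
customers = [(15, 20), (16, 18), (18, 22), (80, 85), (82, 90), (85, 88)]

labels, centroids = kmeans(customers, k=2)
groups = agnes(customers, k=2)
print("k-means labels:", labels)
print("AGNES groups (point indices):", groups)
```

On data this well separated both approaches recover the same two groups; the difference is that `kmeans` optimises a fixed number of clusters directly, while `agnes` builds them up through successive merges.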


Diversity ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 672
Author(s):  
Evan A. Newman ◽  
R. Edward DeWalt ◽  
Scott A. Grubbs

Plecoptera, an environmentally sensitive order of aquatic insects commonly used in water quality monitoring, is experiencing decline across the globe. This study addresses the landscape factors that affect the species richness of stoneflies at the US Geological Survey Hydrologic Unit Code 8 (HUC8) drainage scale in the state of Indiana. Over 6300 specimen records from regional museums, literature, and recent collecting efforts were assigned to HUC8 drainages. A total of 93 species were recorded from the state. The three richest of the 38 HUC8s were the Lower East Fork White (66 species), the Blue-Sinking (58), and the Lower White (51) drainages, all concentrated in the southern, unglaciated part of the state. Richness was predicted using nine variables, reduced from 116, which were subjected to AICc importance and hierarchical partitioning. AICc importance revealed four variables associated with Plecoptera species richness: topographic wetness index, HUC8 area, % soil hydrologic group C/D, and % historic wetland ecosystem. Hierarchical partitioning indicated topographic wetness index, HUC8 area, and % cherty red clay surface geology as significantly important in predicting species richness. This analysis highlights the importance of hydrology and glacial history for the species richness of Plecoptera. The accumulated data are primed to be used for monograph production, niche modeling, and conservation status assessment for an entire assemblage in a large geographic area.


2021 ◽  
Vol 2 (3) ◽  
pp. 1-31
Author(s):  
Liam Steadman ◽  
Nathan Griffiths ◽  
Stephen Jarvis ◽  
Mark Bell ◽  
Shaun Helman ◽  
...  

Analysing and learning from spatio-temporal datasets is an important process in many domains, including transportation, healthcare and meteorology. In particular, data collected by sensors in the environment allows us to understand and model the processes acting within the environment. Recently, the volume of spatio-temporal data collected has increased significantly, presenting several challenges for data scientists. Methods are therefore needed to reduce the quantity of data that needs to be processed in order to analyse and learn from spatio-temporal datasets. In this article, we present the k-Dimensional Spatio-Temporal Reduction method (kD-STR) for reducing the quantity of data used to store a dataset whilst enabling multiple types of analysis on the reduced dataset. kD-STR uses hierarchical partitioning to find spatio-temporal regions of similar instances, and models the instances within each region to summarise the dataset. We demonstrate the generality of kD-STR with three datasets exhibiting different spatio-temporal characteristics and present results for a range of data modelling techniques. Finally, we compare kD-STR with other techniques for reducing the volume of spatio-temporal data. Our results demonstrate that kD-STR is effective in reducing spatio-temporal data and generalises to datasets that exhibit different properties.


2021 ◽  
Author(s):  
Jiangshan Lai ◽  
Yi Zou ◽  
Jinlong Zhang ◽  
Pedro Peres-Neto

Summary: Canonical analysis, a generalization of multiple regression to multiple response variables, is widely used in ecology. Because these models often involve large numbers of parameters (one slope per response per predictor), they pose challenges to model interpretation. Currently, multi-response canonical analysis is constrained by two major challenges. Firstly, we lack quantitative frameworks for estimating the overall importance of single predictors. Secondly, although the commonly used variation partitioning framework for estimating the importance of groups of multiple predictors can also be used to estimate the importance of single predictors, it is currently computationally constrained to a maximum of four predictor matrices. We established that commonality analysis and hierarchical partitioning, widely used for both estimating predictor importance and improving the interpretation of single-response regression models, are related and complementary frameworks that can be expanded for the analysis of multiple-response models. In this application, we aim at: a) demonstrating the mathematical links between commonality analysis, variation and hierarchical partitioning; b) generalizing these frameworks to allow the analysis of any number of responses, predictor variables or groups of predictor variables in the case of variation partitioning; and c) introducing and demonstrating the usage of the R package rdacca.hp that implements these generalized frameworks.
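
The general idea behind hierarchical partitioning of predictor importance can be sketched as follows: each predictor's individual contribution is taken as its average gain in explained variation over every possible order of entry into the model. This is a Python illustration only and does not reproduce the rdacca.hp implementation; the predictor names `x1`, `x2`, `x3` and the R² values in `r2` are invented for the example, and in practice `fit` would refit a regression or canonical model for each subset.

```python
from itertools import permutations
from statistics import fmean


def hierarchical_partitioning(predictors, fit):
    """Average each predictor's gain in fit over all orders of entry."""
    gains = {p: [] for p in predictors}
    for order in permutations(predictors):
        included = frozenset()
        for p in order:
            # Gain in fit when p joins the predictors already in the model.
            gains[p].append(fit(included | {p}) - fit(included))
            included = included | {p}
    return {p: fmean(g) for p, g in gains.items()}


# Hypothetical R^2 for every subset of three predictors (invented values).
r2 = {
    frozenset(): 0.00,
    frozenset({"x1"}): 0.30,
    frozenset({"x2"}): 0.20,
    frozenset({"x3"}): 0.10,
    frozenset({"x1", "x2"}): 0.40,
    frozenset({"x1", "x3"}): 0.35,
    frozenset({"x2", "x3"}): 0.25,
    frozenset({"x1", "x2", "x3"}): 0.45,
}

importance = hierarchical_partitioning(["x1", "x2", "x3"], lambda s: r2[frozenset(s)])
for name, value in importance.items():
    print(f"{name}: {value:.4f}")
print(f"sum of contributions: {sum(importance.values()):.4f}")
```

The individual contributions add up to the R² of the full model, which is what makes them interpretable as shares of the explained variation. Enumerating all orderings grows factorially with the number of predictors, so this brute-force form is only practical for small predictor sets.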


2020 ◽  
Author(s):  
Yoshitake Takada ◽  
Motoharu Uchida ◽  
Naoaki Tezuka ◽  
Mutsumi Tsujino ◽  
Shuhei Sawayama ◽  
...  

2020 ◽  
Vol 30 (3) ◽  
pp. 835-849 ◽  
Author(s):  
Shampa Shahriyar ◽  
Manzur Murshed ◽  
Mortuza Ali ◽  
Manoranjan Paul

2019 ◽  
Vol 148 ◽  
pp. 26-38 ◽  
Author(s):  
Ellen Martins Camara ◽  
Márcia Cristina Costa de Azevedo ◽  
Taynara Pontes Franco ◽  
Francisco Gerson Araújo
