hierarchical clustering algorithm
Recently Published Documents

Total documents: 288 (five years: 80)
H-index: 18 (five years: 3)

2021 ◽  
Vol 39 (4) ◽  
Author(s):  
Neyva Maria Lopes Romeiro ◽  
Mara Caroline Torres dos Santos ◽  
Carolina Panis ◽  
Tiago Viana Flor de Santana ◽  
Paulo Laerte Natti ◽  
...  

This work presents a cluster analysis approach for identifying distinct groups based on clinicopathological data from patients with breast cancer (BC). The following clinical variables were considered: age at diagnosis, weight, height, lymph node (LN) invasion, tumor-node-metastasis (TNM) staging, and body mass index (BMI). Ward's hierarchical clustering algorithm was used to form the groups, separating the BC patients into four clusters. The Kruskal-Wallis test was performed to assess the differences among the clusters. The strength of the influence of the variables on BC prognosis was also evaluated by calculating Spearman's correlation. Positive correlations were obtained between weight and BMI and between TNM and LN invasion in all analyses. Negative correlations between BMI and height were obtained in some of the analyses. Finally, a new correlation was obtained with this approach between weight and TNM, demonstrating that the trophic-adipose status of BC patients can be directly related to disease staging.
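The grouping step described above can be illustrated with a minimal sketch of Ward's agglomerative criterion, which repeatedly merges the pair of clusters whose union least increases the within-cluster sum of squares. The function name `ward_clusters` and the two-variable toy values are assumptions made for illustration; they are not the study's data, variables, or code, and the sketch stops at three groups rather than the four reported.

```python
from itertools import combinations

def ward_clusters(points, k):
    """Merge clusters greedily by Ward's criterion until k clusters remain."""
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        (ma, ca), (mb, cb) = a, b
        na, nb = len(ma), len(mb)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return na * nb / (na + nb) * dist2

    while len(clusters) > k:
        # Pick the cheapest pair to merge.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = [(ni * x + nj * y) / (ni + nj) for x, y in zip(ci, cj)]
        merged = (mi + mj, centroid)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return [sorted(members) for members, _ in clusters]

# Hypothetical standardized (age, BMI) pairs; NOT data from the study.
patients = [(-1.2, -1.0), (-1.0, -1.1), (1.1, 0.9),
            (1.0, 1.2), (-1.1, 1.0), (-0.9, 1.1)]
groups = sorted(ward_clusters(patients, 3))
print(groups)
```

Variables measured in different units (years, kilograms, staging codes) would normally be standardized first, since Ward's criterion is based on squared Euclidean distance.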


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Wenbo Jiang ◽  
Mingyue Zhong

The actual operating state of a wind turbine group is influenced by the wake effect and the control mode; however, current models cannot describe this operating state very well. A dynamic equivalent modeling method for doubly fed wind power generators is proposed that preserves an accurate description of the wind turbine group. Dominant variables, extracted by principal component analysis, are used as the clustering index in the hierarchical clustering algorithm. Three dynamic equivalent models of 24 wind turbines are established on the PSCAD software platform, using 13 state variables, wind speed, and the dominant variables as clustering indexes, respectively. The active and reactive power output curves of the wind farm are then simulated for a three-phase short-circuit fault on the system side and for wind speed fluctuation. The simulation results demonstrate that it is reasonable and effective to extract slip ratio and wind turbine torque as the clustering index, and the maximal relative error between the dominant variable equivalent model and the 13-state-variable model is only 9.9%, which is much lower than that of the wind speed model, K-means clustering model, neural network model, and support vector machine model. The model is easy to implement and has wide application prospects, especially for analyzing the characteristics of large-scale wind farms connected to the power grid.
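One possible reading of "dominant variables extracted by principal component analysis" is to keep the variables with the largest absolute loadings on the first principal component. The sketch below follows that reading only; the selection rule, the function names, the variable names `var_a` to `var_c`, and the sample values are all assumptions for illustration and are not taken from the paper.

```python
def first_component(rows, iters=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5
            for j in range(d)]
    # Standardize so that variables with large units do not dominate.
    z = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(d)]
            for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        # Multiply by the correlation matrix, then renormalize.
        w = [sum(corr[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def dominant_variables(rows, names, n_keep):
    """Names of the n_keep variables with the largest absolute loadings."""
    loadings = first_component(rows)
    ranked = sorted(range(len(names)),
                    key=lambda j: abs(loadings[j]), reverse=True)
    return [names[j] for j in ranked[:n_keep]]

# Hypothetical measurements: var_a and var_b move together, var_c does not.
names = ["var_a", "var_b", "var_c"]
samples = [(1, 2.0, 5), (2, 4.1, 3), (3, 5.9, 6), (4, 8.2, 2), (5, 9.8, 4)]
selected = dominant_variables(samples, names, 2)
print(selected)
```

The selected columns would then serve as the features passed to the hierarchical clustering step. Power iteration is used here only to keep the sketch dependency-free; a numerical library's eigendecomposition would be the usual choice.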


2021 ◽  
pp. 453-465
Author(s):  
Zhuxi Zhang ◽  
Yichong Chen ◽  
Jing Fang ◽  
Xueyang Zhou ◽  
Yuhang An ◽  
...  

Author(s):  
Marwan B. Mohammed ◽  
Wafaa AL-Hameed

Clustering analysis techniques play an important role in data mining. Although several clustering techniques exist, work continues on improving the efficiency of the clustering process and on proposing new techniques. These techniques seek to allocate objects into clusters so that two objects in the same cluster are more similar than two objects in different clusters, taking care not to duplicate the same object in different groups while covering as much of the data as possible. This paper makes two contributions. The first is a new algorithm, named the MB Algorithm, that collects unlabeled data and places them into appropriate groups. The second is the creation of a lexical sequence sentence (LCS) based on semantically similar sentences, which differs from the traditional lexical word chain (LCW) based on words. The results show that the MB algorithm generally outperformed both the hierarchical clustering algorithm and the K-means algorithm.


Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2666
Author(s):  
Daniel Gómez ◽  
Javier Castro ◽  
Inmaculada Gutiérrez ◽  
Rosa Espínola

In this paper we formally define the hierarchical clustering network problem (HCNP) as the problem of finding a good hierarchical partition of a network. This new problem focuses on the dynamic process of the clustering rather than on the final picture of the clustering process. To address it, we introduce a new hierarchical clustering algorithm in networks, based on a new shortest path betweenness measure. To calculate it, the communication between each pair of nodes is weighted by the importance of the nodes that establish this communication. The weights or importance associated with each pair of nodes are calculated as the Shapley value of a game, named the linear modularity game. This new measure (the node-game shortest path betweenness measure) is used to obtain a hierarchical partition of the network by eliminating the link with the highest value. To evaluate the performance of our algorithm, we introduce several criteria that allow us to compare different dendrograms of a network from two points of view: modularity and homogeneity. Finally, we propose a faster algorithm based on a simplification of the node-game shortest path betweenness measure, whose order is quadratic on sparse networks. This fast version is competitive from a computational point of view with other fast hierarchical algorithms, and, in general, it provides better results.
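The divisive scheme of "eliminating the link with the highest value" can be sketched with the classical, unweighted shortest-path edge betweenness standing in for the paper's measure. This is explicitly a substitution: the node-game measure and its Shapley-value weights are not reproduced here, and the function names and the toy graph are assumptions for illustration.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path edge betweenness of an unweighted, undirected graph."""
    score = {}
    for s in adj:
        dist, order, queue = {s: 0}, [], deque([s])
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies backwards
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1 + delta[w])
                edge = tuple(sorted((v, w)))
                score[edge] = score.get(edge, 0.0) + c
                delta[v] += c
    return score

def components(adj):
    """Connected components as a sorted list of sorted node lists."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        stack, part = [s], set()
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(sorted(part))
    return sorted(parts)

def divisive_hierarchy(edges):
    """Remove the highest-scoring link repeatedly; record each new partition."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    levels = [components(adj)]
    while any(adj.values()):
        scores = edge_betweenness(adj)
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
        parts = components(adj)
        if parts != levels[-1]:
            levels.append(parts)
    return levels

# Toy network: two triangles joined by the bridge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
levels = divisive_hierarchy(edges)
print(levels[1])
```

The list `levels` runs from the whole network down to single nodes, which is the sequence of partitions a dendrogram would display. Swapping in a different link score only requires replacing `edge_betweenness`.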


Author(s):  
Hasih Pratiwi ◽  
Sri S. Handajani ◽  
Irwan Susanto ◽  
Senot Sangadji ◽  
Renny Meilawati ◽  
...  

2021 ◽  
Vol 8 (10) ◽  
pp. 43-50
Author(s):  
Truong et al. ◽  

Clustering is a fundamental technique in data mining and machine learning. Recently, many researchers have become interested in the problem of clustering categorical data, and several new approaches have been proposed. One of the successful and pioneering clustering algorithms is the Minimum-Minimum Roughness algorithm (MMR), a top-down hierarchical clustering algorithm that can handle the uncertainty in clustering categorical data. However, MMR tends to choose the category with less value leaf node with more objects, leading to undesirable clustering results. To overcome this shortcoming, this paper proposes an improved version of the MMR algorithm for clustering categorical data, called IMMR (Improved Minimum-Minimum Roughness). Experimental results on real data sets taken from UCI show that the IMMR algorithm outperforms MMR in clustering categorical data.
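Roughness-based methods for categorical data rest on the rough-set notions of lower and upper approximation. As a minimal sketch of those standard definitions (not of the MMR or IMMR selection rules themselves), the code below computes the roughness of the set of objects sharing one attribute value with respect to another attribute, and the mean over all values. The function names and the toy table are assumptions for illustration.

```python
from collections import defaultdict

def partition(rows, attr):
    """Group row indices into equivalence classes by their value of attr."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[row[attr]].add(i)
    return dict(classes)

def roughness(rows, target_attr, value, by_attr):
    """Roughness of {rows with target_attr == value} with respect to by_attr."""
    x = partition(rows, target_attr)[value]
    lower, upper = set(), set()
    for block in partition(rows, by_attr).values():
        if block <= x:       # block lies entirely inside X
            lower |= block
        if block & x:        # block overlaps X
            upper |= block
    return 1 - len(lower) / len(upper)

def mean_roughness(rows, target_attr, by_attr):
    """Average roughness over every value taken by target_attr."""
    values = partition(rows, target_attr)
    total = sum(roughness(rows, target_attr, v, by_attr) for v in values)
    return total / len(values)

# Hypothetical categorical table; NOT one of the UCI data sets.
table = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "small"},
    {"color": "blue", "size": "large"},
    {"color": "blue", "size": "large"},
]
r_red = roughness(table, "color", "red", "size")
r_blue = roughness(table, "color", "blue", "size")
r_mean = mean_roughness(table, "color", "size")
print(r_red, r_blue, r_mean)
```

A roughness of 0 means the set is described exactly by the other attribute, while a value of 1 means no equivalence class fits entirely inside it.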


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Zhenghua Hu ◽  
Kejie Huang ◽  
Enyou Zhang ◽  
Qi’ang Ge ◽  
Xiaoxue Yang

Traveling by bike-sharing systems has become an indispensable means of transportation in daily life because green commuting has gradually become a consensus and a conscious action. However, the problem of “difficult to rent or to return a bike” has gradually become an issue in operating bike-sharing systems. Moreover, scientific and systematic schemes that can efficiently complete the task of rebalancing bike-sharing systems are lacking. This study introduces the basic idea of the k-divisive hierarchical clustering algorithm. A rebalancing strategy based on the level-of-detail model in combination with a genetic algorithm is proposed. Data were collected from the bike-sharing system in Ningbo. Results showed that the proposed algorithm could alleviate the uneven distribution of demand for renting or returning bikes and effectively improve the service of the bike-sharing system. Compared with the traditional method, this algorithm helps reduce the effective time for rebalancing bike-sharing systems by 28.3%. Therefore, it is an effective rebalancing scheme.


Author(s):  
Daniel Gómez ◽  
Javier Castro ◽  
Inmaculada Gutiérrez García-Pardo ◽  
Rosa Espínola

In this paper we formally define the hierarchical clustering network problem (HCNP) as the problem of finding a good hierarchical partition of a network. This new problem focuses on the dynamic process of the clustering rather than on the final picture of the clustering process. To address it, we introduce a new hierarchical clustering algorithm in networks, based on a new shortest path betweenness measure. To calculate it, the communication between each pair of nodes is weighted by the importance of the nodes that establish this communication. The weights or importance associated with each pair of nodes are calculated as the Shapley value of a game, named the linear modularity game. This new measure (the node-game shortest path betweenness measure) is used to obtain a hierarchical partition of the network by eliminating the link with the highest value. To evaluate the performance of our algorithm, we introduce several criteria that allow us to compare different dendrograms of a network from two points of view: modularity and homogeneity. Finally, we propose a faster algorithm based on a simplification of the node-game shortest path betweenness measure, whose order is quadratic on sparse networks. This fast version is competitive from a computational point of view with other fast hierarchical algorithms, and, in general, it provides better results.
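Comparing dendrograms from the modularity point of view requires scoring individual partitions. A minimal sketch using the standard Newman modularity for an unweighted, undirected network follows; whether the paper's criteria use exactly this form is an assumption, and the function name and toy graph are illustrative only.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an unweighted, undirected graph."""
    m = len(edges)
    label = {node: idx
             for idx, nodes in enumerate(communities) for node in nodes}
    inside = [0] * len(communities)   # edges with both ends in the community
    degree = [0] * len(communities)   # total degree of the community's nodes
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2
               for c in range(len(communities)))

# Toy network: two triangles joined by the bridge (2, 3).
graph = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
q_split = modularity(graph, [[0, 1, 2], [3, 4, 5]])
q_whole = modularity(graph, [[0, 1, 2, 3, 4, 5]])
print(q_split, q_whole)
```

Evaluating `modularity` on each level of a dendrogram gives one curve per algorithm, so two hierarchies of the same network can be compared level by level.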

