scholarly journals VDPC: Variational Density Peak Clustering Algorithm

Author(s):  
Yizhang Wang ◽  
Di Wang ◽  
You Zhou ◽  
Chai Quek ◽  
Xiaofeng Zhang

<div>Clustering is an important unsupervised knowledge acquisition method, which divides the unlabeled data into different groups \cite{atilgan2021efficient,d2021automatic}. Different clustering algorithms make different assumptions on the cluster formation, thus, most clustering algorithms are able to well handle at least one particular type of data distribution but may not well handle the other types of distributions. For example, K-means identifies convex clusters well \cite{bai2017fast}, and DBSCAN is able to find clusters with similar densities \cite{DBSCAN}. </div><div>Therefore, most clustering methods may not work well on data distribution patterns that are different from the assumptions being made and on a mixture of different distribution patterns. Taking DBSCAN as an example, it is sensitive to the loosely connected points between dense natural clusters as illustrated in Figure~\ref{figconnect}. The density of the connected points shown in Figure~\ref{figconnect} is different from the natural clusters on both ends, however, DBSCAN with fixed global parameter values may wrongly assign these connected points and consider all the data points in Figure~\ref{figconnect} as one big cluster.</div>

2021 ◽  
Author(s):  
Yizhang Wang ◽  
Di Wang ◽  
You Zhou ◽  
Chai Quek ◽  
Xiaofeng Zhang

<div>Clustering is an important unsupervised knowledge acquisition method, which divides the unlabeled data into different groups \cite{atilgan2021efficient,d2021automatic}. Different clustering algorithms make different assumptions on the cluster formation, thus, most clustering algorithms are able to well handle at least one particular type of data distribution but may not well handle the other types of distributions. For example, K-means identifies convex clusters well \cite{bai2017fast}, and DBSCAN is able to find clusters with similar densities \cite{DBSCAN}. </div><div>Therefore, most clustering methods may not work well on data distribution patterns that are different from the assumptions being made and on a mixture of different distribution patterns. Taking DBSCAN as an example, it is sensitive to the loosely connected points between dense natural clusters as illustrated in Figure~\ref{figconnect}. The density of the connected points shown in Figure~\ref{figconnect} is different from the natural clusters on both ends, however, DBSCAN with fixed global parameter values may wrongly assign these connected points and consider all the data points in Figure~\ref{figconnect} as one big cluster.</div>


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Chao Tong ◽  
Jianwei Niu ◽  
Bin Dai ◽  
Zhongyu Xie

In complex networks, cluster structure, identified by the heterogeneity of nodes, has become a common and important topological property. Network clustering methods are thus significant for the study of complex networks. Currently, many typical clustering algorithms have some weakness like inaccuracy and slow convergence. In this paper, we propose a clustering algorithm by calculating the core influence of nodes. The clustering process is a simulation of the process of cluster formation in sociology. The algorithm detects the nodes with core influence through their betweenness centrality, and builds the cluster’s core structure by discriminant functions. Next, the algorithm gets the final cluster structure after clustering the rest of the nodes in the network by optimizing method. Experiments on different datasets show that the clustering accuracy of this algorithm is superior to the classical clustering algorithm (Fast-Newman algorithm). It clusters faster and plays a positive role in revealing the real cluster structure of complex networks precisely.


2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Yaohui Liu ◽  
Dong Liu ◽  
Fang Yu ◽  
Zhengming Ma

Clustering is widely used in data analysis, and density-based methods are developed rapidly in the recent 10 years. Although the state-of-art density peak clustering algorithms are efficient and can detect arbitrary shape clusters, they are nonsphere type of centroid-based methods essentially. In this paper, a novel local density hierarchical clustering algorithm based on reverse nearest neighbors, RNN-LDH, is proposed. By constructing and using a reverse nearest neighbor graph, the extended core regions are found out as initial clusters. Then, a new local density metric is defined to calculate the density of each object; meanwhile, the density hierarchical relationships among the objects are built according to their densities and neighbor relations. Finally, each unclustered object is classified to one of the initial clusters or noise. Results of experiments on synthetic and real data sets show that RNN-LDH outperforms the current clustering methods based on density peak or reverse nearest neighbors.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Rong Zhou ◽  
Yong Zhang ◽  
Shengzhong Feng ◽  
Nurbol Luktarhan

Clustering aims to differentiate objects from different groups (clusters) by similarities or distances between pairs of objects. Numerous clustering algorithms have been proposed to investigate what factors constitute a cluster and how to efficiently find them. The clustering by fast search and find of density peak algorithm is proposed to intuitively determine cluster centers and assign points to corresponding partitions for complex datasets. This method incorporates simple structure due to the noniterative logic and less few parameters; however, the guidelines for parameter selection and center determination are not explicit. To tackle these problems, we propose an improved hierarchical clustering method HCDP aiming to represent the complex structure of the dataset. A k-nearest neighbor strategy is integrated to compute the local density of each point, avoiding to select the nonnecessary global parameter dc and enables cluster smoothing and condensing. In addition, a new clustering evaluation approach is also introduced to extract a “flat” and “optimal” partition solution from the structure by adaptively computing the clustering stability. The proposed approach is conducted on some applications with complex datasets, where the results demonstrate that the novel method outperforms its counterparts to a large extent.


Author(s):  
Yuchi Kanzawa ◽  

In this study, two co-clustering algorithms based on Bezdek-type fuzzification of fuzzy clustering are proposed for categorical multivariate data. The two proposed algorithms are motivated by the fact that there are only two fuzzy co-clustering methods currently available – entropy regularization and quadratic regularization – whereas there are three fuzzy clustering methods for vectorial data: entropy regularization, quadratic regularization, and Bezdek-type fuzzification. The first proposed algorithm forms the basis of the second algorithm. The first algorithm is a variant of a spherical clustering method, with the kernelization of a maximizing model of Bezdek-type fuzzy clustering with multi-medoids. By interpreting the first algorithm in this way, the second algorithm, a spectral clustering approach, is obtained. Numerical examples demonstrate that the proposed algorithms can produce satisfactory results when suitable parameter values are selected.


2019 ◽  
Vol 2019 ◽  
pp. 1-13 ◽  
Author(s):  
Zhenni Jiang ◽  
Xiyu Liu ◽  
Minghe Sun

This study proposes a novel method to calculate the density of the data points based on K-nearest neighbors and Shannon entropy. A variant of tissue-like P systems with active membranes is introduced to realize the clustering process. The new variant of tissue-like P systems can improve the efficiency of the algorithm and reduce the computation complexity. Finally, experimental results on synthetic and real-world datasets show that the new method is more effective than the other state-of-the-art clustering methods.


Symmetry ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 60
Author(s):  
Kun Gao ◽  
Hassan Ali Khan ◽  
Wenwen Qu

Density clustering has been widely used in many research disciplines to determine the structure of real-world datasets. Existing density clustering algorithms only work well on complete datasets. In real-world datasets, however, there may be missing feature values due to technical limitations. Many imputation methods used for density clustering cause the aggregation phenomenon. To solve this problem, a two-stage novel density peak clustering approach with missing features is proposed: First, the density peak clustering algorithm is used for the data with complete features, while the labeled core points that can represent the whole data distribution are used to train the classifier. Second, we calculate a symmetrical FWPD distance matrix for incomplete data points, then the incomplete data are imputed by the symmetrical FWPD distance matrix and classified by the classifier. The experimental results show that the proposed approach performs well on both synthetic datasets and real datasets.


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 459
Author(s):  
Shuyi Lu ◽  
Yuanjie Zheng ◽  
Rong Luo ◽  
Weikuan Jia ◽  
Jian Lian ◽  
...  

The clustering algorithm plays an important role in data mining and image processing. The breakthrough of algorithm precision and method directly affects the direction and progress of the following research. At present, types of clustering algorithms are mainly divided into hierarchical, density-based, grid-based and model-based ones. This paper mainly studies the Clustering by Fast Search and Find of Density Peaks (CFSFDP) algorithm, which is a new clustering method based on density. The algorithm has the characteristics of no iterative process, few parameters and high precision. However, we found that the clustering algorithm did not consider the original topological characteristics of the data. We also found that the clustering data is similar to the social network nodes mentioned in DeepWalk, which satisfied power-law distribution. In this study, we tried to consider the topological characteristics of the graph in the clustering algorithm. Based on previous studies, we propose a clustering algorithm that adds the topological characteristics of original data on the basis of the CFSFDP algorithm. Our experimental results show that the clustering algorithm with topological features significantly improves the clustering effect and proves that the addition of topological features is effective and feasible.


2019 ◽  
Vol 1229 ◽  
pp. 012024 ◽  
Author(s):  
Fan Hong ◽  
Yang Jing ◽  
Hou Cun-cun ◽  
Zhang Ke-zhen ◽  
Yao Ruo-xia

Sign in / Sign up

Export Citation Format

Share Document