Improving Density Peak Clustering by Automatic Peak Selection and Single Linkage Clustering

Density peak clustering (DPC) is a density-based clustering method that has attracted much attention in the academic community. DPC works by first searching for density peaks in the dataset and then assigning each data point to the same cluster as its nearest higher-density point. One problem with DPC is determining the density peaks: a poor selection can yield poor clustering results. Another problem is its cluster assignment strategy, which often makes incorrect assignments for data points that are far from their nearest higher-density points. This study modifies DPC and proposes a new clustering algorithm to resolve these problems. The proposed algorithm uses the radius of the neighborhood to automatically select a set of potential density peaks, which are points far from their nearest higher-density points. Using the potential density peaks as the density peaks, it then applies DPC to yield preliminary clustering results. Finally, it applies single-linkage clustering to the preliminary clusters to reduce the number of clusters, if necessary. The proposed algorithm avoids the cluster assignment problem in DPC because the cluster assignments for the potential density peaks are based on single-linkage clustering rather than on DPC. Our performance study shows that the proposed algorithm outperforms DPC on datasets with irregularly shaped clusters.
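To make the two baseline DPC steps named in this abstract concrete (finding density peaks, then assigning each point to the cluster of its nearest higher-density point), here is a minimal sketch in plain Python. It is not the proposed algorithm: it has no automatic peak selection and no single-linkage step. The function name `dpc`, the parameters `d_c` and `n_peaks`, the cutoff-count density, and the choice of peaks by the largest product of density and separation distance are all assumptions made for illustration.

```python
import math

def dpc(points, d_c, n_peaks):
    """Baseline DPC sketch: returns (labels, rho, delta) for a list of coordinate tuples."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density (assumed cutoff kernel): number of other points closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    # Densest point first; ties are broken by index so the order is deterministic.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n   # distance to the nearest higher-density point
    parent = [-1] * n   # index of that nearest higher-density point
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention for the densest point: use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i], parent[i] = dist[i][j], j
    # Assumed peak selection: largest rho * delta; the densest point is always a peak.
    gamma = [rho[i] * delta[i] for i in range(n)]
    gamma[order[0]] = math.inf
    peaks = sorted(range(n), key=lambda i: -gamma[i])[:n_peaks]
    labels = [-1] * n
    for cluster_id, p in enumerate(peaks):
        labels[p] = cluster_id
    # Visiting points in density order guarantees each parent is labelled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta
```

On two well-separated groups of points with `n_peaks=2`, each group receives its own label. The assignment loop is also where the weakness described in the abstract appears: a point inherits its parent's label no matter how far away that parent is.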

HaloDPC: An Improved Recognition Method on Halo Node for Density Peak Clustering Algorithm

International Journal of Pattern Recognition and Artificial Intelligence ◽ 10.1142/s0218001419500125 ◽ 2019 ◽ Vol 33 (08) ◽ pp. 1950012 ◽ Cited By ~ 4

Author(s): Jianhua Jiang ◽ Wei Zhou ◽ Limin Wang ◽ Xin Tao ◽ Keqin Li

Keyword(s): Clustering Algorithm ◽ High Dimensionality ◽ Test Cases ◽ Low Density ◽ Recognition Method ◽ Density Peak ◽ Irregular Shapes ◽ Density Peaks ◽ Density Peaks Clustering ◽ Density Peak Clustering

Density peaks clustering (DPC) is known as an excellent approach for detecting clusters with complicated shapes in high-dimensional data. However, it cannot detect outliers, hub nodes, or boundary nodes, and it cannot form low-density clusters. Halos are therefore adopted to improve the performance of DPC in processing low-density nodes. This paper explores the potential reasons for adopting halos instead of low-density nodes and proposes an improved halo-node recognition method for the density peak clustering algorithm (HaloDPC). HaloDPC improves the handling of varying densities, irregular shapes, and the number of clusters, as well as the detection of outliers and hub nodes. The paper demonstrates the advantages of HaloDPC on several test cases.

Density Peak Clustering Based on Relative Density Optimization

Mathematical Problems in Engineering ◽ 10.1155/2020/2816102 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-8

Author(s): Chunzhong Li ◽ Yunong Zhang

Keyword(s): Relative Density ◽ Clustering Algorithms ◽ Real Data ◽ Classification Problem ◽ Data Sets ◽ Density Peak ◽ Data Set ◽ Density Peaks ◽ Assignment Strategy ◽ Density Peak Clustering

Among numerous clustering algorithms, clustering by fast search and find of density peaks (DPC) is favoured because it is less affected by the shapes and density structures of the data set. However, DPC still shows limitations when clustering data sets with heterogeneous clusters, and it easily makes mistakes when assigning the remaining points. A new algorithm, density peak clustering based on relative density optimization (RDO-DPC), is proposed to address these problems and obtain better results. Using the neighborhood information of the sample points, the proposed algorithm defines a relative density for the sample data and then searches for and recognizes density peaks of the nonhomogeneous distribution as cluster centers. A new assignment strategy is proposed to solve the abundance classification problem. Experiments on synthetic and real data sets show good performance of the proposed algorithm.
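The abstract does not give the RDO-DPC definition of relative density, so the following is only an illustrative sketch of the general idea of normalizing a point's density by the densities in its neighborhood. The function name `relative_density`, the parameter `k`, and the specific formula (inverse mean k-nearest-neighbor distance, divided by the mean of the same quantity over the point's neighbors) are assumptions, not the paper's method.

```python
import math

def relative_density(points, k):
    """Illustrative relative density; assumes distinct points and 1 <= k <= len(points) - 1."""
    n = len(points)
    neighbours = []
    density = []
    for i in range(n):
        # k nearest neighbours of point i by brute force, nearest first.
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        neighbours.append(nearest)
        # Assumed absolute density: inverse of the mean distance to those neighbours.
        mean_dist = sum(math.dist(points[i], points[j]) for j in nearest) / k
        density.append(1.0 / mean_dist)
    # Relative density: own density divided by the mean density of the neighbours.
    return [
        density[i] * k / sum(density[j] for j in neighbours[i])
        for i in range(n)
    ]
```

Because both the numerator and the denominator scale the same way, this ratio is unchanged when all coordinates are multiplied by a constant. That is the property that lets the center of a sparse cluster score as highly as the center of a dense one, which is the kind of heterogeneous-density case the abstract refers to.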

Accelerating Density Peak Clustering Algorithm

Symmetry ◽ 10.3390/sym11070859 ◽ 2019 ◽ Vol 11 (7) ◽ pp. 859 ◽ Cited By ~ 1

Author(s): Lin

Keyword(s): Clustering Algorithm ◽ Local Density ◽ Early Stage ◽ Separation Distance ◽ Density Peak ◽ Density Peaks ◽ Density Based Clustering ◽ Data Points ◽ Data Point ◽ Density Peak Clustering

The Density Peak Clustering (DPC) algorithm is a new density-based clustering method. It spends most of its execution time on calculating the local density and the separation distance for each data point in a dataset. The purpose of this study is to accelerate its computation. On average, the DPC algorithm scans half of the dataset to calculate the separation distance of each data point. We propose an approach to calculate the separation distance of a data point by scanning only the neighbors of the data point. Additionally, the purpose of the separation distance is to assist in choosing the density peaks, which are the data points with both high local density and high separation distance. We propose an approach to identify non-peak data points at an early stage to avoid calculating their separation distances. Our experimental results show that most of the data points in a dataset can benefit from the proposed approaches to accelerate the DPC algorithm.
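The neighbor-scanning idea can be sketched as follows, under stated assumptions. If a point's k nearest neighbors are visited nearest first, the first neighbor with higher density is also the nearest higher-density point in the whole dataset, because every non-neighbor is at least as far away as the k-th neighbor. Only points with no denser neighbor need a full scan. The function name `separation_distances`, the parameter `k`, and the fallback strategy are hypothetical and are not taken from the paper; the neighbor lists are built by brute force here purely to keep the sketch self-contained.

```python
import math

def separation_distances(points, rho, k):
    """Return (delta, full_scans): separation distances and how many points needed a full scan."""
    n = len(points)
    # rank[i] < rank[j] means i is treated as denser than j (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = [0] * n
    for r, i in enumerate(order):
        rank[i] = r
    delta = [0.0] * n
    full_scans = 0
    for i in range(n):
        # k nearest neighbours, nearest first (a spatial index would replace this in practice).
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        hit = next((j for j in nearest if rank[j] < rank[i]), None)
        if hit is not None:
            # A denser neighbour exists, so no scan over the rest of the dataset is needed.
            delta[i] = math.dist(points[i], points[hit])
            continue
        # Fallback: no denser point among the neighbours, scan all denser points.
        full_scans += 1
        denser = [j for j in range(n) if rank[j] < rank[i]]
        if denser:
            delta[i] = min(math.dist(points[i], points[j]) for j in denser)
        else:
            # The densest point has no denser point; use its largest distance by convention.
            delta[i] = max(math.dist(points[i], points[j]) for j in range(n))
    return delta, full_scans
```

The returned `full_scans` count gives a rough sense of how many points fall back to the expensive path for a given `k`; the result for `delta` is the same as a full scan would produce.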

Density Peak Clustering Algorithm Considering Topological Features

Electronics ◽ 10.3390/electronics9030459 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 459

Author(s): Shuyi Lu ◽ Yuanjie Zheng ◽ Rong Luo ◽ Weikuan Jia ◽ Jian Lian ◽ ...

Keyword(s): Clustering Algorithm ◽ Clustering Algorithms ◽ Original Data ◽ Power Law Distribution ◽ Density Peak ◽ Topological Features ◽ Density Peaks ◽ Topological Characteristics ◽ Density Peak Clustering ◽ Clustering Data

Clustering algorithms play an important role in data mining and image processing. Breakthroughs in algorithmic precision and methodology directly affect the direction and progress of subsequent research. At present, clustering algorithms are mainly divided into hierarchical, density-based, grid-based, and model-based types. This paper studies the Clustering by Fast Search and Find of Density Peaks (CFSFDP) algorithm, a new density-based clustering method. The algorithm requires no iterative process, has few parameters, and achieves high precision. However, we found that it does not consider the original topological characteristics of the data. We also found that the clustering data resemble the social network nodes discussed in DeepWalk, which follow a power-law distribution. In this study, we consider the topological characteristics of the graph in the clustering algorithm. Building on previous studies, we propose a clustering algorithm that adds the topological characteristics of the original data to the CFSFDP algorithm. Our experimental results show that the clustering algorithm with topological features significantly improves the clustering results, which indicates that adding topological features is effective and feasible.

Density Peak Clustering algorithm using knowledge learning-based fruit fly optimization

International Journal of Computers and Applications ◽ 10.1080/1206212x.2018.1440340 ◽ 2018 ◽ Vol 40 (3) ◽ pp. 1-10

Author(s): Ruihong Zhou ◽ Qiaoming Liu ◽ Xuming Han ◽ Limin Wang

Keyword(s): Clustering Algorithm ◽ Fruit Fly ◽ Density Peak ◽ Fruit Fly Optimization ◽ Density Peak Clustering ◽ Knowledge Learning

A Fast Density Peak Clustering Algorithm Optimized by Uncertain Number Neighbors for Breast MR Image

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1229/1/012024 ◽ 2019 ◽ Vol 1229 ◽ pp. 012024 ◽ Cited By ~ 1

Author(s): Fan Hong ◽ Yang Jing ◽ Hou Cun-cun ◽ Zhang Ke-zhen ◽ Yao Ruo-xia

Keyword(s): Clustering Algorithm ◽ Mr Image ◽ Density Peak ◽ Breast Mr ◽ Density Peak Clustering

Nearest-Neighbour-Induced Isolation Similarity and Its Impact on Density-Based Clustering

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v33i01.33014755 ◽ 2019 ◽ Vol 33 ◽ pp. 4755-4762 ◽ Cited By ~ 3

Author(s): Xiaoyu Qin ◽ Kai Ming Ting ◽ Ye Zhu ◽ Vincent CS Lee

Keyword(s): Clustering Algorithm ◽ Distance Measure ◽ Nearest Neighbour ◽ Density Peak ◽ Density Based Clustering ◽ New Type ◽ Density Peak Clustering ◽ The Impact ◽ First Time ◽ Tree Method

A recent proposal of a data-dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity and propose a nearest-neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on density-based clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly improved to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure in DBSCAN with the proposed nearest-neighbour-induced Isolation Similarity, leaving the rest of the procedure unchanged. A new type of cluster called a mass-connected cluster is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes an algorithm that detects mass-connected clusters when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected while density-connected clusters cannot.

A privacy‐preserving density peak clustering algorithm in cloud computing

Concurrency and Computation Practice and Experience ◽ 10.1002/cpe.5641 ◽ 2020 ◽ Vol 32 (11) ◽ Cited By ~ 1

Author(s): Liping Sun ◽ Shang Ci ◽ Xiaoqing Liu ◽ Xiaoyao Zheng ◽ Qingying Yu ◽ ...

Keyword(s): Cloud Computing ◽ Clustering Algorithm ◽ Privacy Preserving ◽ Density Peak ◽ Density Peak Clustering

VDPC: Variational Density Peak Clustering Algorithm

10.36227/techrxiv.17597669.v1 ◽ 2021

Author(s): Yizhang Wang ◽ Di Wang ◽ You Zhou ◽ Chai Quek ◽ Xiaofeng Zhang

Keyword(s): Clustering Algorithm ◽ Cluster Formation ◽ Clustering Algorithms ◽ Data Distribution ◽ Distribution Patterns ◽ Clustering Methods ◽ Density Peak ◽ Global Parameter ◽ Density Peak Clustering ◽ Parameter Values

Clustering is an important unsupervised knowledge acquisition method that divides unlabeled data into different groups [atilgan2021efficient, d2021automatic]. Different clustering algorithms make different assumptions about cluster formation, so most clustering algorithms handle at least one particular type of data distribution well but may not handle other types. For example, K-means identifies convex clusters well [bai2017fast], and DBSCAN is able to find clusters with similar densities [DBSCAN].

Therefore, most clustering methods may not work well on data distribution patterns that differ from the assumptions being made, or on a mixture of different distribution patterns. DBSCAN, for example, is sensitive to loosely connected points between dense natural clusters, as illustrated in the figure labeled figconnect. The density of the connecting points shown in that figure differs from that of the natural clusters on both ends; however, DBSCAN with fixed global parameter values may wrongly assign these connecting points and treat all the data points in the figure as one big cluster.

Research on Density Peak Clustering Algorithm Based on Artificial Bee Colony Optimization

2018 1st International Conference on Engineering, Communication and Computer Sciences (ICECCS 2018) ◽ 10.23977/iceccs.2018.011 ◽ 2018

Keyword(s): Artificial Bee Colony ◽ Clustering Algorithm ◽ Density Peak ◽ Artificial Bee Colony Optimization ◽ Bee Colony ◽ Bee Colony Optimization ◽ Density Peak Clustering
