A novel density peaks clustering algorithm based on K nearest neighbors with adaptive merging strategy

Author(s):  
Xiaoning Yuan ◽  
Hang Yu ◽  
Jun Liang ◽  
Bing Xu

Abstract Recently the density peaks clustering algorithm (DPC) has received a lot of attention from researchers. The DPC algorithm is able to find cluster centers and complete clustering tasks quickly. It is also suitable for different kinds of clustering tasks. However, deciding the cutoff distance $d_c$ largely depends on human experience, which greatly affects clustering results. In addition, the selection of cluster centers requires manual participation, which affects the efficiency of the algorithm. In order to solve these problems, we propose a density peaks clustering algorithm based on K nearest neighbors with adaptive merging strategy (KNN-ADPC). A cluster merging strategy is proposed to automatically aggregate over-segmented clusters. Additionally, the K nearest neighbors are adopted to divide data points more reasonably. There is only one parameter in the KNN-ADPC algorithm, and the clustering task can be conducted automatically without human involvement. The experiment results on artificial and real-world datasets prove the higher accuracy of KNN-ADPC compared with DBSCAN, K-means++, DPC, and DPC-KNN.
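To make the role of the cutoff distance and the KNN-based density concrete, the following Python sketch computes DPC's two decision quantities, the local density rho and the distance delta to the nearest denser point, using a simple k-nearest-neighbor kernel in place of the cutoff distance d_c. This only illustrates the general idea behind KNN-style DPC variants; the kernel, the default k = 5, and the function name are assumptions, not the formulas of KNN-ADPC.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_rho_delta(X, k=5):
    # Local density rho from the k nearest neighbors: the closer a point's
    # neighbors are, the higher its density (one common KNN-style kernel,
    # assumed here; not the paper's exact formula).
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nbrs.kneighbors(X)               # column 0 is the point itself
    rho = np.exp(-np.mean(dist[:, 1:], axis=1) ** 2)

    # DPC's delta: distance to the nearest point of higher density; the
    # densest point instead gets its largest distance to any other point.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = np.where(rho > rho[i])[0]
        delta[i] = d[i, higher].min() if higher.size else d[i].max()
    return rho, delta

# Candidate cluster centers are points where both rho and delta are large.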


2021 ◽  
Author(s):  
Hui Ma ◽  
Ruiqin Wang ◽  
Shuai Yang

Abstract Clustering by fast search and find of Density Peaks (DPC) has the advantages of being simple, efficient, and capable of detecting clusters of arbitrary shape. However, it still has shortcomings: 1) the cutoff distance must be specified in advance, and the choice of local density formula affects the final clustering result; 2) after the cluster centers are found, the assignment strategy for the remaining points can produce a "domino effect", that is, once a point is misallocated, more points may be misallocated subsequently. To overcome these shortcomings, we propose a density peaks clustering algorithm based on natural nearest neighbors and multi-cluster merging. In this algorithm, a weighted local density calculation method is designed using the natural nearest neighbors, which avoids choosing both the cutoff distance and the local density formula. The algorithm uses a new two-stage assignment strategy to assign the remaining points to the most suitable clusters, thus reducing assignment errors. Experiments were carried out on artificial and real-world datasets. The results show that the clustering performance of this algorithm is better than that of other related algorithms.
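As a rough illustration of the natural-nearest-neighbor idea mentioned above (not the paper's exact weighting), the sketch below grows the neighborhood size r until every point appears in some other point's r-nearest-neighbor list; the resulting natural neighbors can then feed a weighted local density with no cutoff distance. The function name, the all-neighbors precomputation, and the simplified stopping rule are assumptions.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def natural_neighbor_scale(X):
    # Grow the neighborhood size r until every point appears in at least one
    # other point's r-nearest-neighbor list; that r is the "natural" scale.
    # Simplified: full natural-neighbor searches also stop early when
    # isolated outliers would otherwise stall the loop.
    n = len(X)
    nbrs = NearestNeighbors(n_neighbors=n).fit(X)   # all neighbors (small data only)
    _, idx = nbrs.kneighbors(X)                     # idx[:, 0] is the point itself
    reverse_counts = np.zeros(n, dtype=int)
    for r in range(1, n):
        np.add.at(reverse_counts, idx[:, r], 1)     # column r: each point's r-th neighbor
        if np.all(reverse_counts > 0):
            return r, idx[:, 1:r + 1]               # natural scale and neighbor lists
    return n - 1, idx[:, 1:]

# A weighted local density can then be computed from each point's natural
# neighbors, e.g. by summing exp(-distance) over that list.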


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Qi Diao ◽  
Yaping Dai ◽  
Qichao An ◽  
Weixing Li ◽  
Xiaoxue Feng ◽  
...  

This paper presents an improved clustering algorithm for categorizing data with arbitrary shapes. Most conventional clustering approaches work only with round-shaped clusters. Clustering by fast search and find of density peaks (DPC) can handle arbitrarily shaped clusters, but in some cases it is limited by its density peak selection and allocation strategy. To overcome these limitations, two improvements are proposed in this paper. To describe cluster centers more comprehensively, the definitions of local density and relative distance are fused from multiple distance measures, including K-nearest neighbors (KNN) and shared-nearest neighbors (SNN). A similarity-first search algorithm is designed to find the best-matching cluster centers for non-center points in a weighted KNN graph. Extensive comparisons are carried out with several existing methods: the traditional DPC algorithm, density-based spatial clustering of applications with noise (DBSCAN), affinity propagation (AP), FKNN-DPC, and K-means. Experiments on synthetic and real data show that the proposed clustering algorithm can outperform DPC, DBSCAN, AP, and K-means in terms of the clustering accuracy (ACC), the adjusted mutual information (AMI), and the adjusted Rand index (ARI).
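To illustrate the shared-nearest-neighbor (SNN) ingredient mentioned above, the sketch below counts how many k-nearest neighbors two points have in common, one standard SNN similarity. How the paper actually fuses this with KNN distances into its local density and relative distance is not reproduced here; the function name and k = 8 are assumptions.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def snn_similarity(X, k=8):
    # Similarity of points i and j = number of points shared by their
    # k-nearest-neighbor lists (a standard SNN definition).
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)
    knn_sets = [set(row[1:]) for row in idx]        # drop the point itself
    n = len(X)
    snn = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            snn[i, j] = snn[j, i] = len(knn_sets[i] & knn_sets[j])
    return snn

# A fused density could weight close KNN distances by this SNN overlap; a
# similarity-first search would then follow the highest-similarity edges of
# the weighted KNN graph when assigning non-center points.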

