A Fast Density Peak Clustering Method with Autoselect Cluster Centers

2022 ◽  
Vol 2022 ◽  
pp. 1-13
Author(s):  
Zhihe Wang ◽  
Yongbiao Li ◽  
Hui Du ◽  
Xiaofen Wei

Density peaks clustering requires cluster centers to be selected manually. To address this, the paper proposes a fast clustering method that selects cluster centers automatically. First, the method groups the data and marks each group as a core group or a boundary group according to its density. Second, it determines clusters by iteratively merging pairs of core groups whose distance is less than a threshold, and it selects each cluster's center at the densest position in that cluster. Finally, it assigns each boundary group to the cluster of the nearest cluster center. The method eliminates the manual selection of cluster centers and, according to the experimental results, improves clustering efficiency.
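The group, merge, and assign steps described in this abstract can be sketched roughly as follows. The abstract does not say how groups are formed or how the distance between groups is measured, so this sketch assumes grid cells as groups, point count as density, and centroid-to-centroid distance; the function name and the parameters `cell`, `min_pts`, and `merge_dist` are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

def group_merge_cluster(points, cell, min_pts, merge_dist):
    """Illustrative sketch only; grouping and distance rules are assumptions."""
    # Step 1: group points by grid cell (assumed grouping rule).
    groups = defaultdict(list)
    for idx, p in enumerate(points):
        groups[tuple(math.floor(c / cell) for c in p)].append(idx)
    dim = len(points[0])
    centroid = {
        k: tuple(sum(points[i][d] for i in members) / len(members) for d in range(dim))
        for k, members in groups.items()
    }
    # Mark groups as core or boundary by density (here: point count).
    core = [k for k in groups if len(groups[k]) >= min_pts]
    boundary = [k for k in groups if len(groups[k]) < min_pts]
    if not core:
        raise ValueError("no core groups; lower min_pts or enlarge cell")

    # Step 2: merge core groups closer than merge_dist (union-find).
    root = {k: k for k in core}

    def find(k):
        while root[k] != k:
            root[k] = root[root[k]]  # path compression
            k = root[k]
        return k

    for i, a in enumerate(core):
        for b in core[i + 1:]:
            if math.dist(centroid[a], centroid[b]) < merge_dist:
                root[find(a)] = find(b)

    # Cluster center: centroid of the densest core group in each cluster.
    densest = {}
    for k in core:
        r = find(k)
        if r not in densest or len(groups[k]) > len(groups[densest[r]]):
            densest[r] = k
    cluster_id = {r: c for c, r in enumerate(densest)}
    centers = [centroid[densest[r]] for r in densest]

    labels = [None] * len(points)
    for k in core:
        for i in groups[k]:
            labels[i] = cluster_id[find(k)]

    # Step 3: assign each boundary group to the nearest cluster center.
    for k in boundary:
        nearest = min(range(len(centers)), key=lambda c: math.dist(centroid[k], centers[c]))
        for i in groups[k]:
            labels[i] = nearest
    return labels, centers
```

Because whole groups are merged and assigned rather than individual points, the pairwise work in this sketch scales with the number of groups, which is one plausible reading of why a group-based method could be faster than point-wise DPC.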

Author(s):  
Jianhua Jiang ◽  
Wei Zhou ◽  
Limin Wang ◽  
Xin Tao ◽  
Keqin Li

Density peaks clustering (DPC) is known as an excellent approach for detecting complicated-shaped clusters with high dimensionality. However, it cannot detect outliers, hub nodes, or boundary nodes, nor can it form low-density clusters. Halos are therefore adopted to improve the performance of DPC in processing low-density nodes. This paper explores the potential reasons for adopting halos instead of low-density nodes and proposes an improved halo-node recognition method for the density peak clustering algorithm (HaloDPC). The proposed HaloDPC improves the handling of varying densities, irregular shapes, and the number of clusters, as well as outlier and hub node detection. The paper demonstrates the advantages of the HaloDPC algorithm on several test cases.


2021 ◽  
Vol 10 (9) ◽  
pp. 589
Author(s):  
Zhicheng Shi ◽  
Ding Ma ◽  
Xue Yan ◽  
Wei Zhu ◽  
Zhigang Zhao

Clustering methods in data mining are widely used to detect hotspots in many domains, and they play an increasingly important role in the era of big data. As an advanced algorithm, the density peak clustering (DPC) algorithm can deal with arbitrary datasets, although it does not perform well when the dataset includes multiple densities. The cut-off distance dc is normally chosen from the user's experience, and this choice can affect the clustering result. In this study, a density-peak-based clustering method is proposed to detect clusters in datasets with multiple densities and shapes. Two improvements address the limitations of existing clustering methods. First, DPC has difficulty detecting clusters in a dataset with multiple densities, where each cluster has a unique shape and its interior includes different densities; the proposed method adopts a step-by-step merging approach to solve this problem. Second, high-density points are selected automatically, without manual participation, which is more efficient than existing methods that require user-specified parameters. According to the experimental results, the clustering method can be applied to various datasets and performs better than traditional methods and DPC.
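To illustrate why the choice of the cut-off distance dc matters, here is a minimal sketch of one often-cited rule of thumb for basic DPC: take dc as a low percentile (roughly 1 to 2%) of all pairwise distances. This is not the method proposed in the abstract above; the function name `cutoff_distance` and the `percent` parameter are hypothetical.

```python
import math
from itertools import combinations

def cutoff_distance(points, percent=2.0):
    """Pick dc as the `percent`-th percentile of sorted pairwise distances."""
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    # Convert the percentage into an index and clamp it to the valid range.
    k = round(len(dists) * percent / 100.0) - 1
    k = max(0, min(len(dists) - 1, k))
    return dists[k]
```

Changing `percent` shifts dc and therefore every local density value, which is why a user-chosen dc can change the clustering result.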


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Ziqi Jia ◽  
Ling Song

The k-prototypes algorithm is a hybrid clustering algorithm that can process both categorical and numerical data. In this study, the method of selecting initial cluster centers is improved and a new hybrid dissimilarity coefficient is proposed. Based on this coefficient, a weighted k-prototypes clustering algorithm (WKPCA) is proposed. The WKPCA algorithm not only improves the selection of initial cluster centers but also introduces a new method for calculating the dissimilarity between data objects and cluster centers. Real UCI data were used to test the WKPCA algorithm. Experimental results show that the WKPCA algorithm is more efficient and robust than other k-prototypes algorithms.
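For context, the classic k-prototypes dissimilarity combines squared Euclidean distance on the numerical attributes with a weighted count of mismatches on the categorical attributes. The sketch below shows that baseline measure only; it is not the hybrid dissimilarity coefficient proposed in WKPCA, whose formula the abstract does not give. The function names and the `gamma` weight are illustrative assumptions.

```python
def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Classic k-prototypes measure: numeric part + gamma * categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric + gamma * categorical

def nearest_prototype(x_num, x_cat, prototypes, gamma=1.0):
    """Return the index of the closest (numeric, categorical) prototype."""
    return min(
        range(len(prototypes)),
        key=lambda k: mixed_dissimilarity(x_num, x_cat, prototypes[k][0], prototypes[k][1], gamma),
    )
```

The weight `gamma` balances the two attribute types: a small value lets numerical distance dominate, while a large value makes categorical mismatches decisive.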


2019 ◽  
Vol 84 (1) ◽  
pp. 9-20
Author(s):  
Xuhui Zhu ◽  
Junliang Shang ◽  
Yan Sun ◽  
Feng Li ◽  
Jin-Xing Liu ◽  
...  

Symmetry ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 1168
Author(s):  
Jun-Lin Lin ◽  
Jen-Chieh Kuo ◽  
Hsing-Wang Chuang

Density peak clustering (DPC) is a density-based clustering method that has attracted much attention in the academic community. DPC works by first searching for density peaks in the dataset and then assigning each data point to the same cluster as its nearest higher-density point. One problem with DPC is the determination of the density peaks, where a poor selection could yield poor clustering results. Another problem with DPC is its cluster assignment strategy, which often makes incorrect cluster assignments for data points that are far from their nearest higher-density points. This study modifies DPC and proposes a new clustering algorithm to resolve these problems. The proposed algorithm uses the radius of the neighborhood to automatically select a set of likely density peaks, which are far from their nearest higher-density points. Using these potential density peaks as the density peaks, it then applies DPC to yield the preliminary clustering results. Finally, it uses single-linkage clustering on the preliminary clustering results to reduce the number of clusters, if necessary. The proposed algorithm avoids the cluster assignment problem in DPC because the cluster assignments for the potential density peaks are based on single-linkage clustering rather than on DPC. The performance study shows that the proposed algorithm outperforms DPC for datasets with irregularly shaped clusters.
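The two DPC steps summarized in this abstract (find density peaks, then assign each point to the cluster of its nearest higher-density point) can be sketched as below. This is a minimal version of basic DPC with a cut-off kernel, not the modified algorithm the abstract proposes. Picking the peaks as the points with the largest product of density and distance is a common automatic stand-in for manual selection, and the function and parameter names are assumptions for illustration.

```python
import math

def dpc(points, dc, n_clusters):
    """Basic density peak clustering with a cut-off kernel (illustrative)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho: number of other points closer than dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc) for i in range(n)]
    # Order points from highest to lowest density; ties are broken by index,
    # so an earlier point in this order counts as "higher density".
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {i: r for r, i in enumerate(order)}
    delta = [0.0] * n   # distance to the nearest higher-density point
    parent = [-1] * n   # index of that nearest higher-density point
    for r, i in enumerate(order):
        if r == 0:
            # The densest point has no higher-density neighbor.
            delta[i] = max(dist[i])
            continue
        j = min(order[:r], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j
    # Density peaks: largest rho * delta (ties broken by density order).
    peaks = sorted(range(n), key=lambda i: (-rho[i] * delta[i], rank[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(peaks):
        labels[i] = c
    # Assign remaining points in density order, so each parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels
```

Because every non-peak point simply inherits the label of its parent, one wrong parent link propagates to all points below it in the density order, which is the assignment weakness the abstract describes.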


2018 ◽  
Vol 246 ◽  
pp. 03006
Author(s):  
Yanke Wang ◽  
Qidan Zhu ◽  
Wenchang Nie ◽  
Hong Xiao

Most existing clustering algorithms are constrained by the computation of the similarity function and the representation of each object. In this paper, we propose a clustering tracker based on a region proposal network (RPN-C) that performs tracking by clustering the anchors output by the region proposal network into potential centers. We first cut off the second part of Faster RCNN and then apply clustering algorithms in the feature space of the anchors, including K-Means, mean shift, and a density peak clustering strategy, using the anchors' centroid and scale information. Without fully connected layers, the RPN-C tracker can lower the computational cost by up to 60% while still maintaining an accurate prediction of the object's location in the next frame. To evaluate the robustness of this tracker, we establish a dataset containing over 2000 training images and 7 testing sequences of 8 kinds of fruits. The experimental results on our own dataset demonstrate that the proposed tracker performs excellently in both locating the object and determining its scale, and that it remains stable under occlusion and complicated backgrounds.


2021 ◽  
Author(s):  
Hui Ma ◽  
Ruiqin Wang ◽  
Shuai Yang

Clustering by fast search and find of Density Peaks (DPC) has the advantages of being simple, efficient, and capable of detecting clusters of arbitrary shape. However, it still has some shortcomings: 1) the cutoff distance must be specified in advance, and the choice of local density formula affects the final clustering result; 2) after the cluster centers are found, the assignment strategy for the remaining points may produce a "Domino effect", that is, once a point is misallocated, more points may be misallocated subsequently. To overcome these shortcomings, we propose a density peaks clustering algorithm based on natural nearest neighbors and multi-cluster merging. In this algorithm, a weighted local density calculation method is designed using the natural nearest neighbor, which avoids having to select both a cutoff distance and a local density formula. The algorithm uses a new two-stage assignment strategy to assign the remaining points to the most suitable clusters, thus reducing assignment errors. Experiments were carried out on several artificial and real-world datasets. The experimental results show that the clustering performance of this algorithm is better than that of other related algorithms.

