A Fast Density Peak Clustering Method with Autoselect Cluster Centers

Density peaks clustering requires cluster centers to be selected manually. To address this, this paper proposes a fast clustering method that selects cluster centers automatically. First, the method groups the data and marks each group as a core group or a boundary group according to its density. Second, it determines clusters by iteratively merging pairs of core groups whose distance is less than a threshold, and it selects each cluster center at the densest position in its cluster. Finally, it assigns each boundary group to the cluster of the nearest cluster center. According to the experimental results, the method eliminates the need for manual selection of cluster centers and improves clustering efficiency.
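The three steps in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how groups are formed or how density and distance are measured, so the sketch assumes the groups are already given as centroids with a density value each, that a group is a core group when its density reaches a hypothetical `density_threshold`, and that core groups are merged transitively (via union-find) when their centroids are closer than a hypothetical `merge_threshold`.

```python
import numpy as np

def cluster_groups(centroids, densities, density_threshold, merge_threshold):
    """Illustrative sketch: merge core groups, pick centers, attach boundary groups."""
    centroids = np.asarray(centroids, dtype=float)
    densities = np.asarray(densities, dtype=float)
    n = len(centroids)

    # Step 1 (assumed rule): a group is "core" if its density reaches the threshold.
    is_core = densities >= density_threshold
    core = [int(i) for i in np.flatnonzero(is_core)]
    boundary = [int(i) for i in np.flatnonzero(~is_core)]

    # Step 2: merge core groups whose centroids are closer than merge_threshold.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for pos, a in enumerate(core):
        for b in core[pos + 1:]:
            if np.linalg.norm(centroids[a] - centroids[b]) < merge_threshold:
                parent[find(a)] = find(b)

    # The center of each merged cluster is its densest core group.
    centers = {}
    for g in core:
        root = find(g)
        if root not in centers or densities[g] > densities[centers[root]]:
            centers[root] = g
    center_ids = sorted(centers.values())
    root_to_label = {find(c): k for k, c in enumerate(center_ids)}

    labels = np.full(n, -1)
    for g in core:
        labels[g] = root_to_label[find(g)]
    if not center_ids:
        return labels, center_ids  # no core groups: nothing to attach to

    # Step 3: each boundary group joins the cluster of its nearest center.
    for g in boundary:
        nearest = min(center_ids,
                      key=lambda c: np.linalg.norm(centroids[g] - centroids[c]))
        labels[g] = root_to_label[find(nearest)]
    return labels, center_ids


labels, center_ids = cluster_groups(
    centroids=[[0, 0], [1, 0], [10, 0], [11, 0], [2.5, 0]],
    densities=[5, 9, 7, 4, 1],
    density_threshold=3,
    merge_threshold=1.5,
)
```

In this toy input, groups 0 to 3 are core groups that merge into two clusters, and the low-density group 4 is attached to the cluster whose center is nearest.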

HaloDPC: An Improved Recognition Method on Halo Node for Density Peak Clustering Algorithm

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001419500125 ◽

2019 ◽

Vol 33 (08) ◽

pp. 1950012 ◽

Cited By ~ 4

Author(s):

Jianhua Jiang ◽

Wei Zhou ◽

Limin Wang ◽

Xin Tao ◽

Keqin Li

Keyword(s):

Clustering Algorithm ◽

High Dimensionality ◽

Test Cases ◽

Low Density ◽

Recognition Method ◽

Density Peak ◽

Irregular Shapes ◽

Density Peaks ◽

Density Peaks Clustering ◽

Density Peak Clustering

Density peaks clustering (DPC) is known as an excellent approach for detecting complicated-shaped clusters in high-dimensional data. However, it cannot detect outliers, hub nodes, or boundary nodes, nor can it form low-density clusters. A halo is therefore adopted to improve the performance of DPC in processing low-density nodes. This paper explores the potential reasons for adopting halos instead of low-density nodes and proposes HaloDPC, an improved recognition method for halo nodes in the density peak clustering algorithm. HaloDPC improves the ability to deal with varying densities, irregular shapes, and the number of clusters, and to detect outliers and hub nodes. The paper demonstrates the advantages of the HaloDPC algorithm on several test cases.

A Density-Peak-Based Clustering Method for Multiple Densities Dataset

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10090589 ◽

2021 ◽

Vol 10 (9) ◽

pp. 589

Author(s):

Zhicheng Shi ◽

Ding Ma ◽

Xue Yan ◽

Wei Zhu ◽

Zhigang Zhao

Keyword(s):

Data Mining ◽

Big Data ◽

Parameter Selection ◽

Clustering Methods ◽

Clustering Method ◽

Density Peak ◽

Unique Shape ◽

Density Peak Clustering ◽

Selection Of ◽

Better Than

Clustering methods in data mining are widely used to detect hotspots in many domains, and they play an increasingly important role in the era of big data. The density peak clustering (DPC) algorithm is an advanced method that can deal with arbitrary datasets, but it does not perform well when the dataset includes multiple densities. In addition, the cut-off distance dc is normally chosen from the user's experience, and this choice can affect the clustering result. This study proposes a density-peak-based clustering method that detects clusters in datasets with multiple densities and shapes. It makes two improvements over existing clustering methods. First, because each cluster has a unique shape and its interior can include different densities, DPC finds it difficult to detect clusters in such datasets; the proposed method adopts a step-by-step merging approach to solve this problem. Second, high-density points are selected automatically without manual participation, which is more efficient than existing methods that require user-specified parameters. According to the experimental results, the method can be applied to various datasets and performs better than traditional methods and DPC.
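For context on why the cut-off distance matters, the two quantities that standard DPC computes for each point can be written as follows. This is the usual formulation of the original DPC algorithm, not the modified method of this paper; the symbols $d_{ij}$ (distance between points $i$ and $j$), $\rho_i$ (local density) and $\delta_i$ (distance to the nearest higher-density point) are notation introduced here for illustration, and $d_c$ is the cut-off distance the abstract calls dc.

```latex
\rho_i = \sum_{j \neq i} \chi\left(d_{ij} - d_c\right),
\qquad
\chi(x) =
\begin{cases}
1, & x < 0 \\
0, & \text{otherwise}
\end{cases}

\delta_i =
\begin{cases}
\min\limits_{j \,:\, \rho_j > \rho_i} d_{ij}, & \text{if some } j \text{ has } \rho_j > \rho_i \\
\max\limits_{j} d_{ij}, & \text{otherwise}
\end{cases}
```

Since $\rho_i$ counts the points within distance $d_c$ of point $i$, a single global $d_c$ applies the same scale to dense and sparse regions alike, which is consistent with the difficulty on multi-density datasets that the abstract describes.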

Weighted k-Prototypes Clustering Algorithm Based on the Hybrid Dissimilarity Coefficient

Mathematical Problems in Engineering ◽

10.1155/2020/5143797 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Ziqi Jia ◽

Ling Song

Keyword(s):

Categorical Data ◽

Clustering Algorithm ◽

Numerical Data ◽

Experimental Results ◽

Cluster Center ◽

Real Dataset ◽

Dissimilarity Coefficient ◽

Initial Cluster ◽

Data Objects ◽

Selection Of

The k-prototypes algorithm is a hybrid clustering algorithm that can process both categorical and numerical data. In this study, the method of selecting the initial cluster centers is improved and a new hybrid dissimilarity coefficient is proposed. Based on this coefficient, a weighted k-prototypes clustering algorithm (WKPCA) is proposed. WKPCA not only improves the selection of the initial cluster centers but also puts forward a new method for calculating the dissimilarity between data objects and cluster centers. A real dataset from UCI was used to test the WKPCA algorithm. Experimental results show that WKPCA is more efficient and robust than other k-prototypes algorithms.
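As background, the classic k-prototypes dissimilarity between a data object and a cluster prototype combines squared Euclidean distance on the numerical attributes with a weighted count of mismatches on the categorical attributes. The Python sketch below shows that classic measure only. It is not the hybrid dissimilarity coefficient of WKPCA, which the abstract does not define; the function name and the weight parameter `gamma` are illustrative.

```python
def kprototypes_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Classic mixed-type dissimilarity: numeric part + gamma * categorical part."""
    # Numerical attributes: squared Euclidean distance to the prototype.
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    # Categorical attributes: number of attributes whose values differ.
    categorical = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric + gamma * categorical


d = kprototypes_dissimilarity(
    x_num=(1.0, 2.0), x_cat=("red", "s"),
    proto_num=(0.0, 0.0), proto_cat=("red", "m"),
    gamma=0.5,
)
```

Here the numerical part contributes 1 + 4 = 5 and the single categorical mismatch contributes 0.5, so `d` is 5.5. The weight `gamma` controls how much a categorical mismatch counts relative to numerical distance.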

PSO-CFDP: A Particle Swarm Optimization-Based Automatic Density Peaks Clustering Method for Cancer Subtyping

Human Heredity ◽

10.1159/000501481 ◽

2019 ◽

Vol 84 (1) ◽

pp. 9-20

Author(s):

Xuhui Zhu ◽

Junliang Shang ◽

Yan Sun ◽

Feng Li ◽

Jin-Xing Liu ◽

...

Keyword(s):

Particle Swarm Optimization ◽

Particle Swarm ◽

Clustering Method ◽

Swarm Optimization ◽

Density Peaks ◽

Density Peaks Clustering

Improving Density Peak Clustering by Automatic Peak Selection and Single Linkage Clustering

Symmetry ◽

10.3390/sym12071168 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1168

Author(s):

Jun-Lin Lin ◽

Jen-Chieh Kuo ◽

Hsing-Wang Chuang

Keyword(s):

Clustering Algorithm ◽

Academic Community ◽

Performance Study ◽

Potential Density ◽

Cluster Assignment ◽

Density Peak ◽

Single Linkage ◽

Density Peaks ◽

Assignment Strategy ◽

Density Peak Clustering

Density peak clustering (DPC) is a density-based clustering method that has attracted much attention in the academic community. DPC works by first searching density peaks in the dataset, and then assigning each data point to the same cluster as its nearest higher-density point. One problem with DPC is the determination of the density peaks, where poor selection of the density peaks could yield poor clustering results. Another problem with DPC is its cluster assignment strategy, which often makes incorrect cluster assignments for data points that are far from their nearest higher-density points. This study modifies DPC and proposes a new clustering algorithm to resolve the above problems. The proposed algorithm uses the radius of the neighborhood to automatically select a set of the likely density peaks, which are far from their nearest higher-density points. Using the potential density peaks as the density peaks, it then applies DPC to yield the preliminary clustering results. Finally, it uses single-linkage clustering on the preliminary clustering results to reduce the number of clusters, if necessary. The proposed algorithm avoids the cluster assignment problem in DPC because the cluster assignments for the potential density peaks are based on single-linkage clustering, not based on DPC. Our performance study shows that the proposed algorithm outperforms DPC for datasets with irregularly shaped clusters.
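The baseline DPC procedure summarized at the start of this abstract (find density peaks, then give each remaining point the label of its nearest higher-density point) can be sketched in Python with NumPy. This is a minimal sketch of plain DPC, not the modified algorithm the paper proposes. The cut-off kernel, the parameter `d_c`, the fixed `n_clusters`, and the choice of centers by the largest product of density and distance are assumptions made for illustration; ties in density are broken by point index.

```python
import numpy as np

def dpc(points, d_c, n_clusters):
    """Minimal sketch of baseline density peak clustering (cut-off kernel)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density: number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from densest to sparsest (stable sort breaks ties by index).
    order = np.argsort(-rho, kind="stable")

    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = dist[i].max()  # convention for the densest point
            continue
        higher = order[:rank]  # points ranked as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j

    # Centers: points with both high density and large distance to denser points.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point has no parent, so it is a center
    peaks = np.argsort(-gamma, kind="stable")[:n_clusters]

    labels = np.full(n, -1)
    labels[peaks] = np.arange(len(peaks))
    # Every other point inherits the label of its nearest higher-density point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta


blob = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [0.05, 0.05]])
pts = np.vstack([blob, blob + 10.0])
labels, rho, delta = dpc(pts, d_c=0.2, n_clusters=2)
```

The final loop is the assignment step that the abstract identifies as a weakness: a point far from its nearest higher-density point still inherits that point's label.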

Subordinate based Cluster Center Identification in Density Peak Clustering

2018 IEEE 7th Data Driven Control and Learning Systems Conference (DDCLS) ◽

10.1109/ddcls.2018.8516003 ◽

2018 ◽

Author(s):

Jian Hou ◽

Aihua Zhang ◽

Lv Chengcong ◽

E Xu

Keyword(s):

Cluster Center ◽

Density Peak ◽

Density Peak Clustering

Do tracking by clustering anchors output from region proposal network

MATEC Web of Conferences ◽

10.1051/matecconf/201824603006 ◽

2018 ◽

Vol 246 ◽

pp. 03006

Author(s):

Yanke Wang ◽

Qidan Zhu ◽

Wenchang Nie ◽

Hong Xiao

Keyword(s):

Clustering Algorithms ◽

Mean Shift ◽

Computational Cost ◽

Feature Space ◽

Experimental Results ◽

Similarity Function ◽

Density Peak ◽

Training Images ◽

Density Peak Clustering ◽

Fully Connected

Most existing clustering algorithms are limited by the computation of the similarity function and the representation of each object. In this paper, we propose RPN-C, a clustering tracker based on a region proposal network, which performs tracking by clustering the anchors output by the region proposal network into potential centers. We first cut off the second part of Faster R-CNN and then apply clustering algorithms, including K-means, mean shift, and a density peak clustering strategy, in the feature space of the anchors, using the anchors' centroid and scale information. Without fully connected layers, the RPN-C tracker can lower the computational cost by up to 60% while still maintaining an accurate prediction of the localization in the next frame. To evaluate the robustness of the tracker, we establish a dataset containing over 2000 training images and 7 testing sequences of 8 kinds of fruits. The experimental results on this dataset demonstrate that the proposed tracker performs well in both locating the object and deciding its scale, and that it remains stable under occlusion and complicated backgrounds.

A Density Peaks Clustering Method Based on Mutual kNN Graph and Shortest Path

2020 28th Iranian Conference on Electrical Engineering (ICEE) ◽

10.1109/icee50131.2020.9260954 ◽

2020 ◽

Author(s):

Pooya Mehrmohammadi ◽

Mohammad Hatami ◽

Parham Moradi

Keyword(s):

Shortest Path ◽

Clustering Method ◽

Density Peaks ◽

Density Peaks Clustering

Density Peaks Clustering based on Nature Nearest Neighbor and Multi-cluster Mergers

10.21203/rs.3.rs-825405/v1 ◽

2021 ◽

Author(s):

Hui Ma ◽

Ruiqin Wang ◽

Shuai Yang

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

Local Density ◽

Density Peaks ◽

Density Peaks Clustering ◽

Assignment Strategy ◽

Cutoff Distance ◽

Clustering Effect ◽

Cluster Mergers ◽

Selection Of

Clustering by fast search and find of density peaks (DPC) has the advantages of being simple, efficient, and capable of detecting clusters of arbitrary shape. However, it still has some shortcomings: 1) the cutoff distance must be specified in advance, and the choice of local density formula affects the final clustering result; 2) after the cluster centers are found, the assignment strategy for the remaining points may produce a "domino effect", that is, once a point is misallocated, more points may be misallocated subsequently. To overcome these shortcomings, we propose a density peaks clustering algorithm based on natural nearest neighbors and multi-cluster mergers. The algorithm uses natural nearest neighbors to design a weighted local density calculation, which avoids having to select a cutoff distance and a local density formula. It also uses a new two-stage assignment strategy to assign the remaining points to the most suitable clusters, thus reducing assignment errors. Experiments were carried out on several artificial and real-world datasets. The results show that the algorithm clusters better than other related algorithms.

L-DP: A Hybrid Density Peaks Clustering Method

Data Mining and Big Data - Lecture Notes in Computer Science ◽

10.1007/978-3-319-61845-6_8 ◽

2017 ◽

pp. 74-80

Author(s):

Mingjing Du ◽

Shifei Ding

Keyword(s):

Clustering Method ◽

Density Peaks ◽

Density Peaks Clustering
