Clustering Mixed Data by Fast Search and Find of Density Peaks

2017 ◽  
Vol 2017 ◽  
pp. 1-7 ◽  
Author(s):  
Shihua Liu ◽  
Bingzhong Zhou ◽  
Decai Huang ◽  
Liangzhong Shen

For mixed data composed of numerical and categorical attributes, a new unified dissimilarity metric is proposed, together with a new clustering algorithm based on it. Experimental results on UCI datasets show that this method of clustering mixed data by fast search and find of density peaks is feasible and effective.
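The abstract does not state the metric itself, so the following is only a minimal sketch of one common way to build a unified dissimilarity over mixed attributes: a Gower-style average of range-normalized numerical differences and categorical mismatches. The function name `mixed_dissimilarity`, its parameters, and the toy records are all illustrative assumptions, not the authors' definition.

```python
def mixed_dissimilarity(a, b, numeric_idx, categorical_idx, ranges):
    """Gower-style dissimilarity in [0, 1] between two mixed records.

    ASSUMPTION: this is a generic stand-in, not the metric from the paper.
    `ranges[j]` is the observed (max - min) of numerical attribute j.
    """
    n_attrs = len(numeric_idx) + len(categorical_idx)
    if n_attrs == 0:
        raise ValueError("at least one attribute is required")
    total = 0.0
    for j in numeric_idx:
        span = ranges[j]
        # A constant attribute (span == 0) contributes no dissimilarity.
        total += abs(a[j] - b[j]) / span if span > 0 else 0.0
    for j in categorical_idx:
        # Simple matching: 0 if the categories agree, 1 otherwise.
        total += 0.0 if a[j] == b[j] else 1.0
    return total / n_attrs


# Hypothetical toy records: (numerical value, categorical value).
records = [(1.0, "red"), (3.0, "red"), (5.0, "blue")]
ranges = {0: 4.0}  # max - min of attribute 0 over the toy records

d01 = mixed_dissimilarity(records[0], records[1], [0], [1], ranges)
d02 = mixed_dissimilarity(records[0], records[2], [0], [1], ranges)
print(d01, d02)  # 0.25 1.0
```

Because every attribute contributes a value in [0, 1], numerical and categorical attributes carry equal weight here; a weighted average would be the natural variation if some attributes matter more.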

Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 163
Author(s):  
Baobin Duan ◽  
Lixin Han ◽  
Zhinan Gou ◽  
Yi Yang ◽  
Shuangshuang Chen

Mixed data with numerical and categorical attributes are ubiquitous in the real world, and a variety of clustering algorithms have been developed to discover the information hidden in them. Most existing clustering algorithms compute the distances or similarities between data objects from the original data, which may make the clustering results unstable in the presence of noise. In this paper, a clustering framework is proposed to explore the grouping structure of mixed data. First, the categorical attributes, transformed by one-hot encoding, and the normalized numerical attributes are input to a stacked denoising autoencoder to learn internal feature representations. Second, based on these feature representations, the distances between all data objects in the feature space are calculated, and the local density and relative distance of each data object are computed. Third, the density peaks clustering algorithm is improved and employed to allocate the data objects to clusters. Finally, experiments conducted on several UCI datasets demonstrate that the proposed algorithm for clustering mixed data outperforms three baseline algorithms in terms of clustering accuracy and the Rand index.
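The local density and relative distance mentioned above can be sketched as follows. This is the standard density peaks formulation with a cutoff kernel, not the improved variant the abstract refers to, and it starts from a precomputed distance matrix rather than autoencoder features. The names `density_peaks` and `d_c`, the choice of `k` centers by the largest density-distance product, and the toy 1-D points are assumptions made for illustration.

```python
def density_peaks(dist, d_c, k):
    """Standard density peaks clustering on a symmetric distance matrix.

    ASSUMPTIONS: cutoff kernel with threshold d_c; the k points with the
    largest rho * delta are taken as cluster centers (requires k >= 1).
    """
    n = len(dist)
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points by decreasing density; ties are broken by index.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: relative distance is its largest distance.
            delta[i] = max(dist[i])
            continue
        # Relative distance: distance to the nearest point ranked earlier.
        j = min(order[:rank], key=lambda m: dist[i][m])
        delta[i] = dist[i][j]
        nearest_higher[i] = j
    # Centers combine high density with large relative distance.
    centers = sorted(range(n), key=lambda i: -rho[i] * delta[i])[:k]
    labels = [-1] * n
    for label, c in enumerate(centers):
        labels[c] = label
    # The densest point is always the first center, so every remaining
    # point can inherit the label of its nearest higher-density neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return rho, delta, labels


# Hypothetical toy data: two well-separated groups on a line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(p - q) for q in points] for p in points]
rho, delta, labels = density_peaks(dist, d_c=0.15, k=2)
print(rho)     # [1, 2, 1, 1, 2, 1]
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In the framework described above, `dist` would be computed between the learned feature vectors instead of raw coordinates; the rest of the procedure is unchanged in this sketch.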


2021 ◽  
Vol 13 (15) ◽  
pp. 2894
Author(s):  
Xiang Wu ◽  
Fengyan Wang ◽  
Mingchang Wang ◽  
Xuqing Zhang ◽  
Qing Wang ◽  
...  

Light detection and ranging (LiDAR) can quickly and accurately obtain 3D point clouds of rock mass surfaces, from which discontinuity information can be extracted automatically. This paper proposes a new method to automatically extract discontinuity information from 3D point clouds of rock mass surfaces. The method first identifies the discontinuity sets of a rock mass using an improved K-means algorithm based on clustering by fast search and find of density peaks (DPCA) and on the silhouette coefficient, a cluster validity index. It then uses the hierarchical density-based spatial clustering of applications with noise (HDBSCAN) algorithm to segment the discontinuity sets and to extract each discontinuity from a discontinuity set. Finally, the random sample consensus (RANSAC) method is used to fit the discontinuities and to calculate their parameters. The 3D point clouds of a typical rock slope in the Rockbench repository are used to extract the discontinuity orientations with the new method, and these are compared with the results obtained from the classical approach and from previous automatic methods. Compared to the results obtained by Riquelme et al. in 2014, the average deviation of the dip direction and dip angle is reduced by 26% and 8%, respectively; compared to the results obtained by Chen et al. in 2016, it is reduced by 39% and 40%, respectively. The method is also applied to an artificial quarry slope, where the average deviation of the dip direction and dip angle from the manual method is 5.3° and 4.8°, respectively. Furthermore, the related parameters are analyzed. The study shows that the new method is reliable, identifies rock mass discontinuities with higher precision, and can be applied to practical engineering.
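Once a plane has been fitted to a discontinuity, its dip direction and dip angle follow from the plane's normal vector. The sketch below shows that conversion and a wrap-aware deviation between two dip directions. The abstract does not give the paper's coordinate convention, so the axes (x = east, y = north, z = up) and the function names `plane_orientation` and `angular_deviation` are assumptions.

```python
import math


def plane_orientation(normal):
    """Return (dip_direction, dip_angle) in degrees for a plane normal.

    ASSUMPTION: x points east, y points north, z points up; dip direction
    is measured clockwise from north.
    """
    nx, ny, nz = normal
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm == 0.0:
        raise ValueError("normal vector must be non-zero")
    if nz < 0:
        # Use the upward-pointing normal so the result is unambiguous.
        nx, ny, nz = -nx, -ny, -nz
    # Dip angle: angle between the upward normal and the vertical axis.
    dip_angle = math.degrees(math.acos(nz / norm))
    # Dip direction: azimuth of the normal's horizontal projection.
    # For a horizontal plane it is undefined; atan2(0, 0) yields 0 here.
    dip_direction = math.degrees(math.atan2(nx, ny)) % 360.0
    return dip_direction, dip_angle


def angular_deviation(a, b):
    """Smallest absolute difference between two azimuths in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


# Hypothetical example: a plane dipping 45 degrees toward the east.
dip_direction, dip_angle = plane_orientation((1.0, 0.0, 1.0))
print(round(dip_direction, 6), round(dip_angle, 6))  # 90.0 45.0
```

The wrap-aware difference matters when comparing dip directions near north: azimuths of 358° and 3° differ by 5°, not 355°.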

