density peaks
Recently Published Documents

TOTAL DOCUMENTS: 410 (FIVE YEARS: 63)
H-INDEX: 26 (FIVE YEARS: 0)

2022 · Vol 2022 · pp. 1-13
Author(s): Zhihe Wang, Yongbiao Li, Hui Du, Xiaofen Wei

Density peaks clustering requires cluster centers to be selected manually; to address this, this paper proposes a fast clustering method that selects cluster centers automatically. First, the method groups the data and marks each group as a core group or a boundary group according to its density. Second, it determines clusters by iteratively merging core groups whose distance is less than a threshold, and selects a cluster center at the densest position in each cluster. Finally, it assigns each boundary group to the cluster of its nearest cluster center. Experimental results show that the method eliminates the manual selection of cluster centers and improves clustering efficiency.
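
The grouping-and-merging procedure described in this abstract can be sketched in a few lines. In the hypothetical sketch below, the initial grouping is done with small k-means groups, the core/boundary split uses a density quantile, and the merge test is a plain Euclidean distance between group centers; all three are illustrative assumptions, since the abstract does not specify them.

```python
# Hedged sketch of merge-based density peaks clustering with automatic centers.
# The grouping step, density cut-off, and merge threshold are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def auto_center_dpc(X, n_groups=50, density_quantile=0.5, merge_dist=1.0):
    # Step 1: group the data (small k-means groups stand in for the paper's
    # grouping) and score each group's density as size per unit of spread.
    km = KMeans(n_clusters=n_groups, n_init=10).fit(X)
    centers, labels = km.cluster_centers_, km.labels_
    sizes = np.bincount(labels, minlength=n_groups)
    spread = np.array([X[labels == g].std() + 1e-9 for g in range(n_groups)])
    density = sizes / spread
    core = density >= np.quantile(density, density_quantile)  # core vs boundary

    # Step 2: iteratively merge core groups whose centers are closer than the
    # threshold (union-find over pairwise center distances).
    parent = np.arange(n_groups)
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    core_idx = np.where(core)[0]
    for i in core_idx:
        for j in core_idx:
            if i < j and np.linalg.norm(centers[i] - centers[j]) < merge_dist:
                parent[find(i)] = find(j)

    # Step 3: one cluster per merged component; its center is the densest
    # core group inside the component.
    comp_of = {}
    for g in core_idx:
        comp_of.setdefault(find(g), []).append(g)
    comp_ids = list(comp_of)
    cluster_centers = np.array([centers[max(gs, key=lambda g: density[g])]
                                for gs in comp_of.values()])

    # Step 4: core groups keep their component; each boundary group is
    # assigned to the nearest cluster center.
    group_cluster = np.empty(n_groups, dtype=int)
    for g in range(n_groups):
        if core[g]:
            group_cluster[g] = comp_ids.index(find(g))
        else:
            group_cluster[g] = np.argmin(
                np.linalg.norm(cluster_centers - centers[g], axis=1))
    return group_cluster[labels]  # final cluster label per original point
```

Working on a small number of groups rather than individual points is where the claimed efficiency gain would come from: the merge step scales with the number of groups, not with the data set size.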



2022 · Vol 2022 · pp. 1-17
Author(s): Zhihui Hu, Xiaoran Wei, Xiaoxu Han, Guang Kou, Haoyu Zhang, ...

Density peaks clustering (DPC) is a well-known density-based clustering algorithm that can deal with nonspherical clusters well. However, DPC has high computational complexity and space complexity in calculating the local density ρ and the distance δ, which makes it suitable only for small-scale data sets. In addition, for clustering high-dimensional data, the performance of DPC still needs to be improved. High-dimensional data not only make the data distribution more complex but also lead to more computational overhead. To address these issues, we propose an improved density peaks clustering algorithm, which combines feature reduction with a data sampling strategy. Specifically, features of the high-dimensional data are automatically extracted by principal component analysis (PCA), auto-encoder (AE), and t-distributed stochastic neighbor embedding (t-SNE). Next, in order to reduce the computational overhead, we propose a novel data sampling method for the low-dimensional feature data. Firstly, the data distribution in the low-dimensional feature space is estimated by a Quasi-Monte Carlo (QMC) sequence with low-discrepancy characteristics. Then, the representative QMC points are selected according to their cell densities. Next, the selected QMC points are used to calculate ρ and δ instead of the original data points. In general, the number of selected QMC points is much smaller than the size of the initial data set. Finally, a two-stage classification strategy based on the QMC point clustering results is proposed to classify the original data set. Compared with current works, our proposed algorithm can reduce the computational complexity from O(n²) to O(Nn), where N denotes the number of selected QMC points and n is the size of the original data set; typically N ≪ n. Experimental results demonstrate that the proposed algorithm can effectively reduce the computational overhead and improve the model performance.
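
A minimal sketch may help make the O(Nn) claim concrete. Here PCA stands in for the PCA/AE/t-SNE options, a scrambled Sobol sequence provides the low-discrepancy QMC points, and a simple cell-count threshold selects the representatives; the cut-off values, the Gaussian kernel for ρ, and the γ = ρ·δ center pick are assumptions, and the paper's two-stage classification is reduced to a nearest-representative lookup.

```python
# Hedged sketch of DPC over QMC representative points (not the paper's exact method).
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import qmc

def qmc_dpc(X, n_components=2, m=7, min_cell=5, dc=0.1, n_clusters=3):
    # 1) feature reduction (PCA stands in for PCA/AE/t-SNE)
    Z = PCA(n_components=n_components).fit_transform(X)

    # 2) low-discrepancy QMC points spanning the reduced space (2**m points)
    q = qmc.Sobol(d=n_components, scramble=True).random_base2(m=m)
    q = qmc.scale(q, Z.min(axis=0), Z.max(axis=0))

    # 3) cell densities: how many data points fall nearest to each QMC point
    d_zq = np.linalg.norm(Z[:, None, :] - q[None, :, :], axis=2)  # the O(Nn) term
    nearest = d_zq.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(q))
    keep = counts >= min_cell                 # representative QMC points
    reps, w = q[keep], counts[keep]

    # 4) classic DPC quantities on the representatives, weighted by cell counts
    d_rr = np.linalg.norm(reps[:, None, :] - reps[None, :, :], axis=2)
    rho = (w[None, :] * np.exp(-(d_rr / dc) ** 2)).sum(axis=1)
    order = np.argsort(-rho)
    delta = np.full(len(reps), d_rr.max())
    parent = np.arange(len(reps))
    for rank, i in enumerate(order[1:], start=1):
        higher = order[:rank]                 # representatives with larger rho
        j = higher[d_rr[i, higher].argmin()]
        delta[i], parent[i] = d_rr[i, j], j

    # 5) centers = largest rho*delta; the rest inherit their parent's label
    centers = np.argsort(-(rho * delta))[:n_clusters]
    label = np.full(len(reps), -1)
    label[centers] = np.arange(n_clusters)
    for i in order:
        if label[i] < 0:
            label[i] = label[parent[i]]

    # 6) map the original points back through their nearest representative
    rep_of = d_zq[:, keep].argmin(axis=1)
    return label[rep_of]
```

The only computation that touches all n points is the point-to-QMC distance matrix in step 3, which is the O(Nn) term; everything after it scales with the N representatives only.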



2022 · pp. 108123
Author(s): Tengfei Gao, Dan Chen, Yunbo Tang, Bo Du, Rajiv Ranjan, ...




Information · 2021 · Vol 12 (12) · pp. 501
Author(s): Yuanyuan Meng, Xiyu Liu

Community detection is a significant research field in social networks, and modularity is a common measure of how well a network is divided into communities. Many classical algorithms obtain a community partition by improving the modularity of the whole network. However, community division still faces a challenge: traditional modularity optimization has difficulty avoiding the resolution limit. To a certain extent, simply pursuing higher modularity causes the division to deviate from the real community structure. To overcome these defects, drawing on clustering ideas, we propose a method that filters community centers by the relative connection coefficient between vertices and analyzes the community structure accordingly. We discuss how to define the relative connection coefficient between vertices, how to select the community centers, and how to divide the remaining vertices. Experiments on both real and synthetic networks demonstrate that our algorithm is effective compared with state-of-the-art methods.
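
For reference, the modularity this abstract argues against over-optimizing is the Newman-Girvan quantity Q = (1/2m) Σ_ij [A_ij − k_i·k_j/(2m)] δ(c_i, c_j). The tiny networkx example below only illustrates how Q scores a partition; the toy graph and the two-way split are illustrative and are not the paper's method.

```python
# Compute Newman-Girvan modularity Q for a hypothetical two-community split.
import networkx as nx
from networkx.algorithms.community import modularity

G = nx.karate_club_graph()
part = [{n for n, d in G.nodes(data=True) if d["club"] == "Mr. Hi"},
        {n for n, d in G.nodes(data=True) if d["club"] != "Mr. Hi"}]
print(modularity(G, part))  # higher Q = denser-than-expected intra-community edges
```

The resolution limit arises because the expected-edge term k_i·k_j/(2m) shrinks as the network grows, so merging small, genuinely separate communities can still raise Q; this is the deviation from the real community structure that the proposed center-filtering approach aims to avoid.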



2021 · Vol 923 (2) · pp. 161
Author(s): Fahad Nasir, Christopher Cain, Anson D’Aloisio, Nakul Gangolli, Matthew McQuinn

Becker et al. measured the mean free path of Lyman-limit photons in the intergalactic medium (IGM) at z = 6. The short value suggests that absorptions may have played a prominent role in reionization. Here we study physical properties of ionizing photon sinks in the wake of ionization fronts (I-fronts) using radiative hydrodynamic simulations. We quantify the contributions of gaseous structures to the Lyman-limit opacity by tracking the column-density distributions in our simulations. Within Δt = 10 Myr of I-front passage, we find that self-shielding systems (N_HI > 10^17.2 cm^−2) are comprised of two distinct populations: (1) overdensity Δ ∼ 50 structures in photoionization equilibrium with the ionizing background, and (2) Δ ≳ 100 density peaks with fully neutral cores. The self-shielding systems contribute more than half of the opacity at these times, but the IGM evolves considerably in Δt ∼ 100 Myr as structures are flattened by pressure smoothing and photoevaporation. By Δt = 300 Myr, they contribute ≲10% to the opacity in an average 1 Mpc^3 patch of the universe. The percentage can be a factor of a few larger in overdense patches, where more self-shielding systems survive. We quantify the characteristic masses and sizes of self-shielding structures. Shortly after I-front passage, we find M = 10^4–10^8 M_⊙ and effective diameters d_eff = 1–20 ckpc h^−1. These scales increase as the gas relaxes. The picture presented here may be different in dark matter models with suppressed small-scale power.





2021
Author(s): Shuaijun Li, Jia Lu

Self-training algorithms can quickly train a supervised classifier from a few labeled samples and many unlabeled samples. However, self-training is often affected by mislabeled samples, and local noise filters have been proposed to detect them. Nevertheless, current local noise filters have two problems: (a) they ignore the spatial distribution of the nearest neighbors in different classes, and (b) they perform poorly when mislabeled samples are located in the overlapping areas of different classes. To address these challenges, a new self-training algorithm based on density peaks combined with a globally adaptive multi-local noise filter (STDP-GAMNF) is proposed. First, the spatial structure of the data set is revealed by density peak clustering and is used to help self-training label the unlabeled samples. Meanwhile, after each labeling epoch, GAMLNF comprehensively judges, across multiple classes, whether a sample is mislabeled, which effectively reduces the influence of edge samples. Experimental results on eighteen real-world data sets demonstrate that GAMLNF is not sensitive to the value of the neighbor parameter k and can adaptively find the appropriate number of neighbors for each class.
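
Since the abstract only outlines the pipeline, the sketch below shows a generic self-training loop with a neighborhood-agreement noise filter standing in for the density-peak guidance and the GAMLNF filter; the base classifier, the confidence cut-off, and the majority-agreement rule are all illustrative assumptions.

```python
# Hedged sketch: self-training with a simple k-NN agreement noise filter.
# The classifier, thresholds, and filter are stand-ins, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def self_train(X_lab, y_lab, X_unlab, epochs=5, conf=0.9, k=7):
    X_l, y_l = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    for _ in range(epochs):
        if len(pool) == 0:
            break
        clf = LogisticRegression(max_iter=1000).fit(X_l, y_l)
        proba = clf.predict_proba(pool)
        picked = proba.max(axis=1) >= conf           # confidently pseudo-labeled
        if not picked.any():
            break
        X_new = pool[picked]
        y_new = clf.classes_[proba[picked].argmax(axis=1)]

        # Noise filter: keep a pseudo-label only if most of its k nearest
        # already-labeled neighbors agree with it (a stand-in for GAMLNF).
        nn = NearestNeighbors(n_neighbors=min(k, len(X_l))).fit(X_l)
        _, idx = nn.kneighbors(X_new)
        agree = np.array([(y_l[row] == y).mean() for row, y in zip(idx, y_new)])
        keep = agree >= 0.5

        X_l = np.vstack([X_l, X_new[keep]])
        y_l = np.concatenate([y_l, y_new[keep]])
        pool = pool[~picked]                         # drop attempted samples
    return LogisticRegression(max_iter=1000).fit(X_l, y_l)
```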


