density clustering
Recently Published Documents


TOTAL DOCUMENTS

186
(FIVE YEARS 77)

H-INDEX

14
(FIVE YEARS 4)

Symmetry ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 60
Author(s):  
Kun Gao ◽  
Hassan Ali Khan ◽  
Wenwen Qu

Density clustering has been widely used in many research disciplines to determine the structure of real-world datasets. Existing density clustering algorithms only work well on complete datasets. In real-world datasets, however, there may be missing feature values due to technical limitations. Many imputation methods used for density clustering cause the aggregation phenomenon. To solve this problem, a two-stage novel density peak clustering approach with missing features is proposed: First, the density peak clustering algorithm is used for the data with complete features, while the labeled core points that can represent the whole data distribution are used to train the classifier. Second, we calculate a symmetrical FWPD distance matrix for incomplete data points, then the incomplete data are imputed by the symmetrical FWPD distance matrix and classified by the classifier. The experimental results show that the proposed approach performs well on both synthetic datasets and real datasets.


2021 ◽  
Vol 2137 (1) ◽  
pp. 012071
Author(s):  
Shuxin Liu ◽  
Xiangdong Liu

Abstract Cluster analysis is an unsupervised learning process, and its most classic algorithm K-means has the advantages of simple principle and easy implementation. In view of the K-means algorithm’s shortcoming, where is arbitrary processing of clusters k value, initial cluster center and outlier points. This paper discusses the improvement of traditional K-means algorithm and puts forward an improved algorithm with density clustering algorithm. First, it describes the basic principles and process of the K-means algorithm and the DBSCAN algorithm. Then summarizes improvement methods with the three aspects and their advantages and disadvantages, at the same time proposes a new density-based K-means improved algorithm. Finally, it prospects the development direction and trend of the density-based K-means clustering algorithm.


2021 ◽  
pp. 1-12
Author(s):  
JinFang Sheng ◽  
Huaiyu Zuo ◽  
Bin Wang ◽  
Qiong Li

 In a complex network system, the structure of the network is an extremely important element for the analysis of the system, and the study of community detection algorithms is key to exploring the structure of the complex network. Traditional community detection algorithms would represent the network using an adjacency matrix based on observations, which may contain redundant information or noise that interferes with the detection results. In this paper, we propose a community detection algorithm based on density clustering. In order to improve the performance of density clustering, we consider an algorithmic framework for learning the continuous representation of network nodes in a low-dimensional space. The network structure is effectively preserved through network embedding, and density clustering is applied in the embedded low-dimensional space to compute the similarity of nodes in the network, which in turn reveals the implied structure in a given network. Experiments show that the algorithm has superior performance compared to other advanced community detection algorithms for real-world networks in multiple domains as well as synthetic networks, especially when the network data chaos is high.


2021 ◽  
Author(s):  
Tao Zhang ◽  
Dianzheng Fu ◽  
Yuanye Xu ◽  
Jingya Dong

2021 ◽  
Vol 25 (6) ◽  
pp. 1487-1506
Author(s):  
Hao Chen ◽  
Yu Xia ◽  
Yuekai Pan ◽  
Qing Yang

In many clustering problems, the whole data is not always static. Over time, part of it is likely to be changed, such as updated, erased, etc. Suffer this effect, the timeline can be divided into multiple time segments. And, the data at each time slice is static. Then, the data along the timeline shows a series of dynamic intermediate states. The union set of data from all time slices is called the time-series data. Obviously, the traditional clustering process does not apply directly to the time-series data. Meanwhile, repeating the clustering process at every time slices costs tremendous. In this paper, we analyze the transition rules of the data set and cluster structure when the time slice shifts to the next. We find there is a distinct correlation of data set and succession of cluster structure between two adjacent ones, which means we can use it to reduce the cost of the whole clustering process. Inspired by it, we propose a dynamic density clustering method (DDC) for time-series data. In the simulations, we choose 6 representative problems to construct the time-series data for testing DDC. The results show DDC can get high accuracy results for all 6 problems while reducing the overall cost markedly.


Author(s):  
Xiaoyu Luo ◽  
Sheng Zheng ◽  
Yao Huang ◽  
Shuguang Zeng ◽  
Xiangyun Zeng ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Li Xiang ◽  
Li ZongXun

The majority of the traditional methods deal with text matching at the word level which remains uncertain as the text semantic features are ignored. This also leads to the problems of low recall and high space utilization of text matching while the comprehensiveness of matching results is poor. The resultant method, thus, cannot process long text and short text simultaneously. The current study proposes a text matching algorithm for Korean Peninsula language knowledge base based on density clustering. Using the deep multiview semantic document representation model, the semantic vector of the text to be matched is captured for semantic dependency which is utilized to extract the text semantic features. As per the feature extraction outcomes, the text similarity is calculated by subtree matching method, and a semantic classification model based on SWEM and pseudo-twin network is designed for semantic text classification. Finally, the text matching of Korean Peninsula language knowledge base is carried out by applying density clustering algorithm. Experimental results show that the proposed method has high matching recall rate with low space requirements and can effectively match long and short texts concurrently.


2021 ◽  
Author(s):  
James Anibal ◽  
Alexandre Day ◽  
Erol Bahadiroglu ◽  
Liam O'Neill ◽  
Long Phan ◽  
...  

Data clustering plays a significant role in biomedical sciences, particularly in single-cell data analysis. Researchers use clustering algorithms to group individual cells into populations that can be evaluated across different levels of disease progression, drug response, and other clinical statuses. In many cases, multiple sets of clusters must be generated to assess varying levels of cluster specificity. For example, there are many subtypes of leukocytes (e.g. T cells), whose individual preponderance and phenotype must be assessed for statistical/functional significance. In this report, we introduce a novel hierarchical density clustering algorithm (HAL-x) that uses supervised linkage methods to build a cluster hierarchy on raw single-cell data. With this new approach, HAL-x can quickly predict multiple sets of labels for immense datasets, achieving a considerable improvement in computational efficiency on large datasets compared to existing methods. We also show that cell clusters generated by HAL-x yield near-perfect F1-scores when classifying different clinical statuses based on single-cell profiles. Our hierarchical density clustering algorithm achieves high accuracy in single cell classification in a scalable, tunable and rapid manner. We make HAL-x publicly available at: https://pypi.org/project/hal-x/


Sign in / Sign up

Export Citation Format

Share Document