An Attempt at Improving Density-based Clustering Algorithms

The hierarchical topology of wireless sensor networks can effectively reduce energy consumption in communication. Clustering algorithms are the foundation for realizing a hierarchical structure, so they have been extensively researched. Building on the LEACH algorithm, a distance density based clustering algorithm (DDBC) is proposed. When selecting cluster heads, it jointly considers the distribution density of surrounding nodes and the remaining energy of each node in order to dynamically balance energy usage across nodes. We analyzed the performance of DDBC by comparing it with other existing clustering algorithms in simulation experiments. Results show that the proposed method can generate a stable number of cluster heads and balance the energy load effectively.
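The abstract does not give the DDBC selection formula, so the following is only a minimal sketch of the general idea of weighing local node density against remaining energy. The function name `head_scores`, the weight `alpha`, the neighborhood `radius`, and the normalization are all assumptions made for illustration, not parameters of DDBC.

```python
import math

def head_scores(positions, energies, radius, alpha=0.5):
    """Score each node by local density and remaining energy.

    alpha and radius are hypothetical knobs, not DDBC parameters.
    """
    n = len(positions)
    # Density: number of other nodes within `radius` of each node.
    density = [
        sum(1 for j in range(n)
            if j != i and math.dist(positions[i], positions[j]) <= radius)
        for i in range(n)
    ]
    max_density = max(density) or 1  # avoid division by zero
    max_energy = max(energies) or 1
    return [
        alpha * density[i] / max_density + (1 - alpha) * energies[i] / max_energy
        for i in range(n)
    ]

# Three nearby nodes and one isolated node with a full battery.
positions = [(0, 0), (1, 0), (0, 1), (10, 10)]
energies = [0.9, 0.5, 0.5, 1.0]
scores = head_scores(positions, energies, radius=1.5)
best = max(range(len(scores)), key=scores.__getitem__)
```

In this toy input the isolated node has the most energy but no neighbors, so a node in the dense group with high remaining energy receives the top score.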


An improved OPTICS clustering algorithm for discovering clusters with uneven densities

Intelligent Data Analysis ◽ 10.3233/ida-205497 ◽ 2021 ◽ Vol 25 (6) ◽ pp. 1453-1471

Author(s): Chunhua Tang ◽ Han Wang ◽ Zhiwen Wang ◽ Xiangkun Zeng ◽ Huaran Yan ◽ ...

Keyword(s): Time Complexity ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Clustering Algorithms ◽ Substantial Improvement ◽ Experimental Results ◽ High Time ◽ Parameter Setting ◽ K Nearest Neighbor ◽ Density Based Clustering

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering of datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), which is a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) from the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of DP as the neighborhood radius eps of its corresponding cluster. It overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance of the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity, and outperforms other algorithms in parameter setting and noise recognition.
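The k-nearest-neighbor distance mentioned in this abstract can be illustrated with a brute-force sketch. This is not the FOP-OPTICS procedure; it only shows what the k-distance of a point is. The function name `k_distances` and the sample points are assumptions made for illustration.

```python
import math

def k_distances(points, k):
    """Return, for each point, the distance to its k-th nearest neighbor."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

# Three points close together and one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
kd = k_distances(points, k=2)
```

Points in sparse regions get larger k-distances than points in dense regions, which is why this quantity is often used as a local density estimate. The brute-force loop is quadratic in the number of points; a spatial index would be needed for large datasets.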


A survey of density based clustering algorithms

Frontiers of Computer Science ◽ 10.1007/s11704-019-9059-3 ◽ 2020 ◽ Vol 15 (1) ◽ Cited By ~ 1

Author(s): Panthadeep Bhattacharjee ◽ Pinaki Mitra

Keyword(s): Clustering Algorithms ◽ Density Based Clustering


Hierarchical Density-Based Clustering of White Matter Tracts in the Human Brain

International Journal of Knowledge Discovery in Bioinformatics ◽ 10.4018/jkdb.2010100101 ◽ 2010 ◽ Vol 1 (4) ◽ pp. 1-25 ◽ Cited By ~ 5

Author(s): Junming Shao ◽ Klaus Hahn ◽ Qinli Yang ◽ Afra Wohlschläeger ◽ Christian Boehm ◽ ...

Keyword(s): White Matter ◽ Human Brain ◽ Similarity Measure ◽ Multiple Scales ◽ Diffusion Tensor ◽ Clustering Algorithms ◽ Vast Number ◽ Lower Bounding ◽ Density Based Clustering ◽ Neurosurgical Planning

Diffusion tensor magnetic resonance imaging (DTI) provides a promising way of estimating the neural fiber pathways in the human brain non-invasively via white matter tractography. However, it is difficult to analyze the vast number of resulting tracts quantitatively. Automatic tract clustering would be useful for the neuroscience community, as it can contribute to accurate neurosurgical planning, tract-based analysis, or white matter atlas creation. In this paper, the authors propose a new framework for automatic white matter tract clustering using a hierarchical density-based approach. A novel fiber similarity measure based on dynamic time warping allows for an effective and efficient evaluation of fiber similarity. A lower bounding technique is used to further speed up the computation. Then the algorithm OPTICS is applied, to sort the data into a reachability plot, visualizing the clustering structure of the data. Interactive and automatic clustering algorithms are finally introduced to obtain the clusters. Extensive experiments on synthetic data and real data demonstrate the effectiveness and efficiency of our fiber similarity measure and show that the hierarchical density-based clustering method can group these tracts into meaningful bundles on multiple scales as well as eliminating noisy fibers.
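As an illustration of the two ingredients named in this abstract, the sketch below computes a textbook dynamic time warping cost between two fibers given as sequences of 3-D points, plus a cheap lower bound that can be used to skip full computations. The abstract does not specify the authors' similarity measure or their lower bounding technique, so both functions, their names, and the endpoint-based bound are assumptions; both sequences are assumed to be non-empty.

```python
import math

def dtw(a, b):
    """Dynamic time warping cost between two sequences of 3-D points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def endpoint_lower_bound(a, b):
    """Cheap lower bound on dtw(a, b): every warping path matches
    first with first and last with last."""
    first = math.dist(a[0], b[0])
    last = math.dist(a[-1], b[-1])
    # If both sequences have one point, first and last are the same pair.
    return first if len(a) == 1 and len(b) == 1 else first + last

fiber_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
fiber_b = [(0, 0, 0), (2, 0, 0)]
full_cost = dtw(fiber_a, fiber_b)
bound = endpoint_lower_bound(fiber_a, fiber_b)
```

When searching for neighbors within a given distance, a pair whose lower bound already exceeds that distance can be discarded without running the quadratic DTW computation.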


A unified view of density-based methods for semi-supervised clustering and classification

Data Mining and Knowledge Discovery ◽ 10.1007/s10618-019-00651-1 ◽ 2019 ◽ Vol 33 (6) ◽ pp. 1894-1952 ◽ Cited By ~ 3

Author(s): Jadson Castro Gertrudes ◽ Arthur Zimek ◽ Jörg Sander ◽ Ricardo J. G. B. Campello

Keyword(s): Supervised Classification ◽ Clustering Algorithms ◽ Building Blocks ◽ Large Collection ◽ Supervised Clustering ◽ The Core ◽ Density Based Clustering ◽ Clustering And Classification ◽ Unified View ◽ New Framework

Semi-supervised learning is drawing increasing attention in the era of big data, as the gap between the abundance of cheap, automatically collected unlabeled data and the scarcity of labeled data that are laborious and expensive to obtain is dramatically increasing. In this paper, we first introduce a unified view of density-based clustering algorithms. We then build upon this view and bridge the areas of semi-supervised clustering and classification under a common umbrella of density-based techniques. We show that there are close relations between density-based clustering algorithms and the graph-based approach for transductive classification. These relations are then used as a basis for a new framework for semi-supervised classification based on building-blocks from density-based clustering. This framework is not only efficient and effective, but it is also statistically sound. In addition, we generalize the core algorithm in our framework, HDBSCAN*, so that it can also perform semi-supervised clustering by directly taking advantage of any fraction of labeled data that may be available. Experimental results on a large collection of datasets show the advantages of the proposed approach both for semi-supervised classification as well as for semi-supervised clustering.


A state-of-the-art survey on semantic similarity for document clustering using GloVe and density-based algorithms

Indonesian Journal of Electrical Engineering and Computer Science ◽ 10.11591/ijeecs.v22.i1.pp552-562 ◽ 2021 ◽ Vol 22 (1) ◽ pp. 552

Author(s): Shapol M. Mohammed ◽ Karwan Jacksi ◽ Subhi R. M. Zeebaree

Keyword(s): Semantic Similarity ◽ State Of The Art ◽ Clustering Algorithms ◽ Document Clustering ◽ Accuracy Evaluation ◽ Similar Data ◽ Document Similarity ◽ Density Based Clustering ◽ Data Points ◽ The Common

Semantic similarity is the process of identifying data that are semantically related. The traditional way of identifying document similarity is by using synonymous keywords and syntax. In comparison, semantic similarity finds similar data using the meaning of words and semantics. Clustering groups objects that share features and properties into a cluster, separate from objects with different features and properties. In semantic document clustering, documents are clustered using semantic similarity techniques with similarity measurements. One common technique for clustering documents is density-based clustering, which uses the density of data points as the main strategy for measuring the similarity between them. In this paper, a state-of-the-art survey is presented to analyze density-based algorithms for clustering documents. Furthermore, the similarity and evaluation measures used with the selected algorithms are investigated to identify the common ones. The review revealed that the most used density-based algorithms in document clustering are DBSCAN and DPC. The most effective similarity measure used with density-based algorithms, specifically DBSCAN and DPC, is cosine similarity, with F-measure used for performance and accuracy evaluation.
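Cosine similarity, which this survey reports as the most effective measure with DBSCAN and DPC, can be sketched in a few lines. The vectors below are made-up stand-ins for document embeddings (for example averaged word vectors); the function names and the zero-vector convention are assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """Distance form, usable where a clustering algorithm expects a distance."""
    return 1.0 - cosine_similarity(u, v)

doc_a = [1.0, 2.0, 3.0]
doc_b = [2.0, 4.0, 6.0]   # same direction as doc_a, different length
doc_c = [3.0, 0.0, -1.0]  # orthogonal to doc_a
sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
```

Because cosine similarity ignores vector length, two documents with the same direction in embedding space score close to 1 regardless of how long they are.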


Implementation of Clustering Algorithms for Real Time Large Datasets

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.c2570.0981119 ◽ 2019 ◽ Vol 8 (11) ◽ pp. 2303-2304

Keyword(s): Big Data ◽ Clustering Algorithms ◽ Vital Role ◽ Large Datasets ◽ Similar Data ◽ Data Set ◽ Survey Paper ◽ Density Based Clustering ◽ Geographical Maps ◽ Data Objects

Nowadays clustering plays a vital role in big data. It is very difficult to analyze and cluster large volumes of data. Clustering is a procedure for grouping similar data objects in a data set. A good clustering has high intra-cluster similarity and low inter-cluster similarity. Clustering is used in statistical analysis, geographical maps, biological cell analysis, and Google Maps. The various approaches to clustering include grid-based clustering, density-based clustering, hierarchical methods, and partitioning approaches. In this survey paper we focus on all of these algorithms for large datasets such as big data and report a comparison among them. The main metric used to differentiate the algorithms is time complexity.


Density-based clustering with constraints

Computer Science and Information Systems ◽ 10.2298/csis180601007l ◽ 2019 ◽ Vol 16 (2) ◽ pp. 469-489 ◽ Cited By ~ 1

Author(s): Piotr Lasek ◽ Jarek Gryz

Keyword(s): Data Clustering ◽ Clustering Algorithms ◽ Background Knowledge ◽ Data Sets ◽ Benchmark Data ◽ Density Based Clustering

In this paper we present our ic-NBC and ic-DBSCAN algorithms for data clustering with constraints. The algorithms are based on the density-based clustering algorithms NBC and DBSCAN but allow users to incorporate background knowledge into the process of clustering by means of instance constraints. The knowledge about anticipated groups can be applied by specifying the so-called must-link and cannot-link relationships between objects or points. These relationships are then incorporated into the clustering process. In the proposed algorithms this is achieved by properly merging resulting clusters and introducing a new notion of deferred points which are temporarily excluded from clustering and assigned to clusters based on their involvement in cannot-link relationships. To examine the algorithms, we have carried out a number of experiments. We used benchmark data sets and tested the efficiency and quality of the results. We have also measured the efficiency of the algorithms against their original versions. The experiments prove that the introduction of instance constraints improves the quality of both algorithms. The efficiency is reduced only insignificantly, and the reduction is due to extra computation related to the introduced constraints.
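Must-link and cannot-link constraints are simply pairs of point indices. The sketch below does not implement ic-NBC or ic-DBSCAN; it only checks a finished labeling against such pairs, which is one simple way to see what the constraints mean. The function name `constraint_violations` and the sample data are assumptions made for illustration.

```python
def constraint_violations(labels, must_link, cannot_link):
    """Return the must-link and cannot-link pairs that a labeling violates."""
    # A must-link pair is violated when its points land in different clusters.
    broken_must = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    # A cannot-link pair is violated when its points share a cluster.
    broken_cannot = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return broken_must, broken_cannot

labels = [0, 0, 1, 1, 1]          # cluster id per point
must_link = [(0, 1), (1, 2)]      # pairs that should share a cluster
cannot_link = [(0, 4), (2, 3)]    # pairs that should be separated
broken_must, broken_cannot = constraint_violations(labels, must_link, cannot_link)
```

A constrained algorithm aims to produce labelings for which both returned lists are empty; this checker can serve as a quick sanity test on the output of any clustering method.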


DeepDBSCAN: Deep Density-Based Clustering for Geo-Tagged Photos

ISPRS International Journal of Geo-Information ◽ 10.3390/ijgi10080548 ◽ 2021 ◽ Vol 10 (8) ◽ pp. 548

Author(s): Jang-You Park ◽ Dong-June Ryu ◽ Kwang-Woo Nam ◽ Insung Jang ◽ Minseok Jang ◽ ...

Keyword(s): Deep Learning ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Spatial Clustering ◽ Clustering Algorithms ◽ Learning Technology ◽ Neighbor Graph ◽ Density Based Clustering ◽ Real Objects ◽ Nearest Neighbor Graph

Density-based clustering algorithms have been the most commonly used algorithms for discovering regions and points of interest in cities using global positioning system (GPS) information in geo-tagged photos. However, users sometimes find more specific areas of interest using real objects captured in pictures. Recent advances in deep learning technology make it possible to recognize these objects in photos. However, since deep learning detection is a very time-consuming task, simply combining deep learning detection with density-based clustering is very costly. In this paper, we propose a novel algorithm supporting deep content and density-based clustering, called deep density-based spatial clustering of applications with noise (DeepDBSCAN). DeepDBSCAN incorporates object detection by deep learning into the density clustering algorithm using the nearest neighbor graph technique. Additionally, this supports a graph-based reduction algorithm that reduces the number of deep detections. We performed experiments with pictures shared by users on Flickr and compared the performance of multiple algorithms to demonstrate the excellence of the proposed algorithm.
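For readers unfamiliar with the baseline that DeepDBSCAN extends, here is a minimal sketch of plain DBSCAN on 2-D points. It contains none of the deep-learning object detection or graph-based reduction described in the abstract. The coordinates are treated as planar (real GPS data would need a geodesic distance), and the parameter values are arbitrary assumptions.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, NOISE for outliers."""
    n = len(points)
    labels = [None] * n  # None means not yet visited

    def neighbors(i):
        # Brute-force eps-neighborhood, including the point itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels

# Two tight groups of photo locations and one isolated photo.
photos = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(photos, eps=1.5, min_pts=3)
```

Each dense group becomes its own cluster and the isolated point is labeled as noise. The neighborhood search here is brute force, so the sketch is quadratic in the number of points.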

