An improved OPTICS clustering algorithm for discovering clusters with uneven densities

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering on datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) in the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of its corresponding cluster. This overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance to the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity of the algorithms compared and outperforms them in parameter setting and noise recognition.
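The abstract relies on the distance from each point to its k-th nearest neighbor. The sketch below shows only that quantity, computed by brute force; it is not FOP-OPTICS itself, and the function name `k_distances` and the sample points are assumptions made for illustration. The abstract does not say how the paper computes these distances efficiently.

```python
import math


def k_distances(points, k):
    """Return, for each point, the distance to its k-th nearest neighbor.

    Brute-force illustration: every point is compared with every other
    point, so this costs O(n^2 log n) and is only meant for small inputs.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Distances from p to all other points, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


# Hypothetical data: a tight group of three points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
kd = k_distances(points, k=2)
```

A point in a dense region gets a small k-distance, while an isolated point gets a large one, which is why this value is a common density proxy in OPTICS-style methods.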

DRSA: a non-hierarchical clustering algorithm using k-NN graph and its application in vegetation classification

Vegetation of Russia ◽

10.31111/vegrus/2015.27.125 ◽

2015 ◽

pp. 125-138 ◽

Cited By ~ 2

Author(s):

I. V. Goncharenko

Keyword(s):

Cluster Analysis ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Protein Structures ◽

Hierarchical Cluster ◽

Vegetation Classification ◽

K Nearest Neighbor ◽

Neighbor Graph ◽

Nearest Neighbor Graph

In this article we propose a new method of non-hierarchical cluster analysis using a k-nearest-neighbor graph and discuss it with respect to vegetation classification. The method of k-nearest neighbor (k-NN) classification was originally developed in 1951 (Fix, Hodges, 1951). Later the term “k-NN graph” and a few algorithms of k-NN clustering appeared (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an “excessive” graph, a so-called hypergraph, and then truncate it into subgraphs by partitioning and coarsening the hypergraph. We developed a different strategy, “upward” clustering, which forms (assembles sequentially) one cluster after another. Until now, graph-based cluster analysis has not been applied to the classification of vegetation datasets.
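To make the term “k-NN graph” concrete, here is a minimal sketch that builds a directed k-NN graph and extracts its mutual edges. It does not reproduce the DRSA method, whose assembly rules the abstract does not give; `knn_graph`, `mutual_edges`, and the one-dimensional sample data are hypothetical.

```python
import math


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest neighbors."""
    graph = {}
    for i, p in enumerate(points):
        others = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        others.sort()  # nearest first; ties broken by index
        graph[i] = [j for _, j in others[:k]]
    return graph


def mutual_edges(graph):
    """Return undirected edges (i, j) where i and j list each other."""
    return {
        (i, j)
        for i, neighbors in graph.items()
        for j in neighbors
        if i < j and i in graph[j]
    }


# Hypothetical one-dimensional samples forming two loose groups.
points = [(0.0,), (1.0,), (2.5,), (10.0,), (11.0,)]
graph = knn_graph(points, k=1)
edges = mutual_edges(graph)
```

With k=1, point 2 links to point 1, but point 1 links back to point 0, so only the pairs (0, 1) and (3, 4) are mutual neighbors.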

DeepDBSCAN: Deep Density-Based Clustering for Geo-Tagged Photos

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10080548 ◽

2021 ◽

Vol 10 (8) ◽

pp. 548

Author(s):

Jang-You Park ◽

Dong-June Ryu ◽

Kwang-Woo Nam ◽

Insung Jang ◽

Minseok Jang ◽

...

Keyword(s):

Deep Learning ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Spatial Clustering ◽

Clustering Algorithms ◽

Learning Technology ◽

Neighbor Graph ◽

Density Based Clustering ◽

Real Objects ◽

Nearest Neighbor Graph

Density-based clustering algorithms have been the most commonly used algorithms for discovering regions and points of interest in cities using global positioning system (GPS) information in geo-tagged photos. However, users sometimes find more specific areas of interest using real objects captured in pictures. Recent advances in deep learning technology make it possible to recognize these objects in photos. However, since deep learning detection is very time-consuming, simply combining deep learning detection with density-based clustering is very costly. In this paper, we propose a novel algorithm supporting deep content and density-based clustering, called deep density-based spatial clustering of applications with noise (DeepDBSCAN). DeepDBSCAN incorporates object detection by deep learning into the density clustering algorithm using the nearest neighbor graph technique. Additionally, it supports a graph-based reduction algorithm that reduces the number of deep detections. We performed experiments with pictures shared by users on Flickr and compared the performance of multiple algorithms to demonstrate the excellence of the proposed algorithm.
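For orientation, the following is a minimal sketch of plain DBSCAN, the density-based baseline that the abstract builds on. It is not DeepDBSCAN: the object-detection step and the graph-based reduction are omitted, and the coordinates, `eps`, and `min_pts` values are assumptions chosen for illustration.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE for outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical photo locations: two dense spots and one isolated photo.
photos = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(photos, eps=1.5, min_pts=3)
```

In this run the two dense spots become clusters 0 and 1, and the isolated photo is labeled as noise.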

NBC: An Efficient Hierarchical Clustering Algorithm for Large Datasets

International Journal of Semantic Computing ◽

10.1142/s1793351x15400085 ◽

2015 ◽

Vol 09 (03) ◽

pp. 307-331 ◽

Cited By ~ 1

Author(s):

Wei Zhang ◽

Gongxuan Zhang ◽

Yongli Wang ◽

Zhaomeng Zhu ◽

Tao Li

Keyword(s):

Hierarchical Clustering ◽

Time Complexity ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Large Datasets ◽

Nearest Neighbor Search ◽

Large Dataset ◽

Neighbor Search ◽

Hierarchical Clustering Algorithm

Nearest neighbor search is a key technique used in hierarchical clustering, and its computational complexity determines the performance of the hierarchical clustering algorithm. The time complexity of standard agglomerative hierarchical clustering is O(n^3), while the time complexity of more advanced hierarchical clustering algorithms (such as nearest neighbor chain, SLINK and CLINK) is O(n^2). This paper presents a new nearest neighbor search method called nearest neighbor boundary (NNB), which first divides a large dataset into independent subsets and then finds the nearest neighbor of each point within its subset. When NNB is used, the time complexity of hierarchical clustering can be reduced to O(n log 2n). Based on NNB, we propose a fast hierarchical clustering algorithm called nearest-neighbor boundary clustering (NBC), which can be adapted to parallel and distributed computing frameworks. The experimental results demonstrate that our algorithm is practical for large datasets.
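The cubic cost of the standard agglomerative approach can be seen in a minimal sketch: each of the n - 1 merges scans all remaining cluster pairs for the closest one. The code below uses single linkage as an example and is not the NNB or NBC method from the abstract; `naive_single_linkage` and the sample data are hypothetical.

```python
import math


def naive_single_linkage(points, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    if not 1 <= num_clusters <= len(points):
        raise ValueError("num_clusters must be between 1 and len(points)")
    clusters = {i: [i] for i in range(len(points))}
    # Pairwise cluster distances, keyed by (smaller id, larger id).
    dist = {
        (i, j): math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    }
    while len(clusters) > num_clusters:
        a, b = min(dist, key=dist.get)  # O(n^2) scan for the closest pair
        clusters[a].extend(clusters.pop(b))
        for c in clusters:
            if c == a:
                continue
            key_ac = (min(a, c), max(a, c))
            key_bc = (min(b, c), max(b, c))
            # Single linkage: the merged cluster keeps the smaller distance.
            dist[key_ac] = min(dist[key_ac], dist[key_bc])
        # Drop every entry that still refers to the absorbed cluster b.
        dist = {key: d for key, d in dist.items() if b not in key}
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical one-dimensional data with one far-away point.
points = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]
result = naive_single_linkage(points, num_clusters=2)
```

The full scan inside the loop is the step that faster nearest neighbor search aims to avoid.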

Distance Density Based Clustering Algorithm in Wireless Sensor Network

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.291-294.344 ◽

2011 ◽

Vol 291-294 ◽

pp. 344-348

Author(s):

Lin Lin ◽

Shu Yan ◽

Yi Nian

Keyword(s):

Clustering Algorithm ◽

Distribution Density ◽

Simulation Experiment ◽

Clustering Algorithms ◽

Wireless Sensor ◽

Energy Usage ◽

Cluster Heads ◽

Hierarchical Topology ◽

Energy Factors ◽

Density Based Clustering

The hierarchical topology of wireless sensor networks can effectively reduce energy consumption in communication. The clustering algorithm is the foundation for realizing a hierarchical structure, so it has been extensively researched. On the basis of the LEACH algorithm, a distance density based clustering algorithm (DDBC) is proposed, which jointly considers the distribution density of surrounding nodes and the remaining energy of each node to dynamically balance the energy usage of nodes when selecting cluster heads. We analyzed the performance of DDBC by comparing it with other existing clustering algorithms in simulation experiments. Results show that the proposed method can generate a stable number of cluster heads and balance the energy load effectively.

Recognition of 3D Objects from 2D Views Features

Journal of Electronic Commerce in Organizations ◽

10.4018/jeco.2015040105 ◽

2015 ◽

Vol 13 (2) ◽

pp. 50-58

Author(s):

R. Khadim ◽

R. El Ayachi ◽

Mohamed Fakir

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Color Image ◽

Recognition Rate ◽

Experimental Results ◽

Support Vector ◽

K Nearest Neighbor ◽

3D Objects ◽

Color Descriptor

This paper focuses on the recognition of 3D objects using 2D attributes. In order to increase the recognition rate, we present a hybridization of three approaches to calculate the attributes of a color image; this hybridization is based on the combination of Zernike moments, Gist descriptors and a color descriptor (statistical moments). In the classification phase, three methods are adopted: Neural Network (NN), Support Vector Machine (SVM), and k-nearest neighbor (KNN). The COIL-100 database is used in the experiments.
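Of the three classifiers named above, k-nearest neighbor is the simplest to illustrate. The sketch below is a generic majority-vote KNN classifier, not the authors' pipeline; the two-dimensional feature vectors stand in for the real Zernike, Gist and color descriptors, and all names and values are assumptions.

```python
import math
from collections import Counter


def knn_classify(train, query, k):
    """Predict the label of query by majority vote among its k nearest
    training samples. train is a list of (feature_vector, label) pairs."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the training set size")
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors for two object classes.
train = [
    ((0.0, 0.0), "cup"), ((0.0, 1.0), "cup"), ((1.0, 0.0), "cup"),
    ((5.0, 5.0), "car"), ((5.0, 6.0), "car"), ((6.0, 5.0), "car"),
]
first = knn_classify(train, (0.5, 0.5), k=3)
second = knn_classify(train, (5.2, 5.1), k=3)
```

In practice the feature vector would be the concatenated descriptors of one 2D view, and k would be tuned on held-out views.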

K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data

Mathematical Problems in Engineering ◽

10.1155/2015/535932 ◽

2015 ◽

Vol 2015 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Cheng Lu ◽

Shiji Song ◽

Cheng Wu

Keyword(s):

Clustering Analysis ◽

Incomplete Data ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Interval Data ◽

Similarity Function ◽

K Nearest Neighbor ◽

Partial Data ◽

Missing Attributes ◽

Ap Clustering

The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it cannot be directly applied to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is then adapted to handle the interval data, so that the improved AP algorithm can be applied to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
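One plausible reading of a K-nearest neighbor interval is sketched below: the missing attribute of a sample is represented by the range of values that attribute takes among the sample's k nearest neighbors, with neighbors found by a partial distance over the attributes both samples have. This is an assumption about the general idea, not the paper's Improved Partial Data Strategy, and the function names and data are hypothetical.

```python
import math


def partial_distance(a, b):
    """Euclidean distance over attributes present (not None) in both rows,
    rescaled by the fraction of attributes actually compared."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    total = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(total * len(a) / len(shared))


def knn_interval(data, index, attr, k):
    """Return (low, high) for the missing attribute attr of data[index],
    taken from the k nearest rows in which attr is observed."""
    target = data[index]
    candidates = [
        (partial_distance(target, row), row[attr])
        for i, row in enumerate(data)
        if i != index and row[attr] is not None
    ]
    candidates.sort(key=lambda pair: pair[0])
    values = [value for _, value in candidates[:k]]
    if not values:
        raise ValueError("no row has an observed value for this attribute")
    return min(values), max(values)


# Hypothetical data: row 0 is missing its second attribute.
data = [(1.0, None), (1.1, 2.0), (0.9, 2.4), (5.0, 9.0)]
interval = knn_interval(data, index=0, attr=1, k=2)
```

The distant row (5.0, 9.0) is excluded from the interval because it is not among the two nearest neighbors of row 0.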

A novel content based image retrieval system using K-means/KNN with feature extraction

Computer Science and Information Systems ◽

10.2298/csis120122047c ◽

2012 ◽

Vol 9 (4) ◽

pp. 1645-1661 ◽

Cited By ~ 10

Author(s):

Ray-I Chang ◽

Shu-Yu Lin ◽

Jan-Ming Ho ◽

Chi-Wen Fann ◽

Yu-Chun Wang

Keyword(s):

Feature Extraction ◽

Image Retrieval ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Content Based Image Retrieval ◽

K Nearest Neighbor ◽

Color Analysis ◽

Image Retrieval System ◽

First Time ◽

System Designs

Image retrieval has been popular for several years. There are different system designs for content based image retrieval (CBIR) systems. This paper proposes a novel system architecture for CBIR that combines content-based image and color analysis with data mining techniques. To the best of our knowledge, this is the first time that a segmentation and grid module, a feature extraction module, K-means and k-nearest neighbor clustering algorithms, and a neighborhood module have been combined to build a CBIR system. The concept of a neighborhood color analysis module, which also recognizes the sides of every grid of an image, is first contributed in this paper. The results show that the CBIR system performs well in training and indicate that many interesting issues remain to be optimized in the query stage of image retrieval.

Improved minimum-minimum roughness algorithm for clustering categorical data

International Journal of ADVANCED AND APPLIED SCIENCES ◽

10.21833/ijaas.2021.10.006 ◽

2021 ◽

Vol 8 (10) ◽

pp. 43-50

Author(s):

Truong et al. ◽

Keyword(s):

Machine Learning ◽

Data Mining ◽

Hierarchical Clustering ◽

Categorical Data ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Experimental Results ◽

Data Sets ◽

Top Down ◽

Hierarchical Clustering Algorithm

Clustering is a fundamental technique in data mining and machine learning. Recently, many researchers have become interested in the problem of clustering categorical data, and several new approaches have been proposed. One of the successful and pioneering clustering algorithms is the Minimum-Minimum Roughness algorithm (MMR), which is a top-down hierarchical clustering algorithm and can handle the uncertainty in clustering categorical data. However, MMR tends to choose the category with less value leaf node with more objects, leading to undesirable clustering results. To overcome such shortcomings, this paper proposes an improved version of the MMR algorithm for clustering categorical data, called IMMR (Improved Minimum-Minimum Roughness). Experimental results on actual data sets taken from UCI show that the IMMR algorithm outperforms MMR in clustering categorical data.

A Deep Learning Based Method for the Non-Destructive Measuring of Rock Strength through Hammering Sound

Applied Sciences ◽

10.3390/app9173484 ◽

2019 ◽

Vol 9 (17) ◽

pp. 3484

Author(s):

Shuai Han ◽

Heng Li ◽

Mingchao Li ◽

Timothy Rose

Keyword(s):

Clustering Algorithm ◽

Nearest Neighbor ◽

Rock Strength ◽

Support Vector ◽

K Nearest Neighbor ◽

Strength Measurement ◽

Regression Algorithms ◽

Almost All ◽

The Relationship ◽

Non Destructive

Hammering rocks of different strengths produces different sounds. Geological engineers often use this method to approximate the strengths of rocks in geological surveys. This method is quick and convenient but subjective. Inspired by this problem, we present a new, non-destructive method for measuring the surface strengths of rocks based on a deep neural network (DNN) and spectrogram analysis. All the hammering sounds are first transformed into spectrograms, and a clustering algorithm is presented to filter out the outliers of the spectrograms automatically. One of the most advanced image classification DNNs, the Inception-ResNet-v2, is then re-trained with the spectrograms. The results show that the training accuracy is up to 94.5%. Following this, three regression algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF), are adopted to fit the relationship between the outputs of the DNN and the strength values. The tests show that KNN has the highest fitting accuracy, and SVM has the strongest generalization ability. The strengths (represented by rebound values) of almost all the samples can be predicted within an error of [−5, 5]. Overall, the proposed method has great potential in supporting the implementation of efficient rock strength measurement methods in the field.
