A novel clustering algorithm based on the natural reverse nearest neighbor structure

2019 ◽  
Vol 84 ◽  
pp. 1-16 ◽  
Author(s):  
Qi-Zhu Dai ◽  
Zhong-Yang Xiong ◽  
Jiang Xie ◽  
Xiao-Xia Wang ◽  
Yu-Fang Zhang ◽  
...  
2015 ◽  
pp. 125-138 ◽  
Author(s):  
I. V. Goncharenko

In this article we propose a new method of non-hierarchical cluster analysis based on the k-nearest-neighbor graph and discuss it with respect to vegetation classification. The k-nearest neighbor (k-NN) classification method was originally developed in 1951 (Fix, Hodges, 1951); the term "k-NN graph" and several k-NN clustering algorithms appeared later (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an "excessive" graph, a so-called hypergraph, and then truncate it to subgraphs by partitioning and coarsening. We develop a different, "upward" strategy that assembles clusters sequentially, one after the other. Until now, graph-based cluster analysis has not been applied to the classification of vegetation datasets.
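To make the graph-based setup concrete, here is a minimal sketch (assuming scikit-learn and SciPy; the data matrix is a synthetic stand-in for a plot-by-species table) that builds a k-NN graph and reads clusters off its connected components. It illustrates the general k-NN-graph idea only, not the article's "upward" assembly strategy.

```python
# Minimal sketch: build a k-nearest-neighbor graph and take its
# connected components as clusters. Illustrates the general k-NN-graph
# idea, not the article's "upward" cluster-assembly strategy.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import connected_components

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))   # synthetic stand-in for a vegetation table

k = 5
# Sparse adjacency matrix: an edge from each point to its k nearest neighbors.
A = kneighbors_graph(X, n_neighbors=k, mode="connectivity", include_self=False)

# Symmetrize so i~j whenever either point lists the other as a neighbor,
# then read clusters off as connected components of the graph.
A_sym = A.maximum(A.T)
n_clusters, labels = connected_components(A_sym, directed=False)
print(n_clusters, np.bincount(labels))
```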


2010 ◽  
Vol 30 (7) ◽  
pp. 1933-1935 ◽  
Author(s):  
Wen-ming ZHANG ◽  
Jiang WU ◽  
Xiao-jiao YUAN

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Sumedh Yadav ◽  
Mathis Bode

Abstract A scalable graphical method is presented for selecting and partitioning datasets for the training phase of a classification task. The heuristic requires a clustering algorithm whose computational cost stays in reasonable proportion to the training task itself. This step is followed by the construction of an information graph of the underlying classification patterns using approximate nearest neighbor methods. The method consists of two approaches: one for reducing a given training set, and another for partitioning the selected/reduced set. The heuristic targets large datasets, since the primary goal is a significant reduction in training run-time without compromising prediction accuracy. Test results show that both approaches significantly speed up training compared with the state-of-the-art shrinking heuristics available in LIBSVM, while closely matching or even outperforming them in prediction accuracy. A network design is also presented for a partitioning-based distributed training formulation; it yields additional speed-up in training run-time over a serial implementation of the approaches.
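As an illustration of the reduce-then-train pattern the abstract describes, the following sketch shrinks a training set via cheap clustering before fitting an SVM (scikit-learn's SVC wraps LIBSVM). The clustering step, cluster count, and representative selection here are assumptions for illustration, not the paper's heuristic, which additionally builds an information graph with approximate nearest neighbors.

```python
# Sketch of the reduce-then-train pattern: cluster the training set
# cheaply, keep one representative point per cluster, and train an SVM
# (LIBSVM via scikit-learn) on the much smaller reduced set.
# The cluster count and representative rule are illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC

X, y = make_classification(n_samples=20000, n_features=20, random_state=0)

# Cheap clustering step: its cost must stay small relative to SVM training.
km = MiniBatchKMeans(n_clusters=200, random_state=0).fit(X)

# Keep the point nearest each centroid as that cluster's representative.
reps = []
for c in range(km.n_clusters):
    idx = np.flatnonzero(km.labels_ == c)
    if idx.size:
        d = np.linalg.norm(X[idx] - km.cluster_centers_[c], axis=1)
        reps.append(idx[np.argmin(d)])
reps = np.array(reps)

clf = SVC(kernel="rbf").fit(X[reps], y[reps])   # reduced training set
print(clf.score(X, y))
```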


2015 ◽  
Vol 20 (3) ◽  
pp. 351-384 ◽  
Author(s):  
Huaizhong Lin ◽  
Fangshu Chen ◽  
Yunjun Gao ◽  
Dongming Lu

2021 ◽  
Vol 25 (6) ◽  
pp. 1453-1471 ◽  
Author(s):  
Chunhua Tang ◽  
Han Wang ◽  
Zhiwen Wang ◽  
Xiangkun Zeng ◽  
Huaran Yan ◽  
...  

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak performance on datasets with uneven density. To address these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks based on OPTICS), a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) in the augmented cluster ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of the corresponding cluster, overcoming the weakness of most algorithms on datasets with uneven densities. By computing each point's k-nearest-neighbor distance, it reduces the time complexity of OPTICS; by locating density-mutation points within clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity of the compared algorithms and outperforms them in parameter setting and noise recognition.
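The following sketch shows the OPTICS machinery FOP-OPTICS builds on, using scikit-learn: compute the augmented cluster ordering and reachability distances, then cut the reachability plot at a threshold standing in for the demarcation point's reachability-distance. The quantile-based cut is an assumption for illustration, not the paper's DP-finding procedure.

```python
# Sketch of the OPTICS machinery underlying FOP-OPTICS: compute the
# augmented cluster ordering, then cut the reachability plot at a
# threshold (a quantile here, standing in for the demarcation point's
# reachability-distance) to extract DBSCAN-like clusters.
import numpy as np
from sklearn.cluster import OPTICS, cluster_optics_dbscan
from sklearn.datasets import make_blobs

# Blobs with uneven densities, the setting the paper targets.
X, _ = make_blobs(n_samples=500, centers=3,
                  cluster_std=[0.5, 1.0, 2.0], random_state=0)

opt = OPTICS(min_samples=10).fit(X)
reach = opt.reachability_[opt.ordering_]    # the reachability plot values

# Crude stand-in for the demarcation point: a high quantile of the
# finite reachability distances; FOP-OPTICS locates this cut itself.
eps = np.quantile(reach[np.isfinite(reach)], 0.9)

labels = cluster_optics_dbscan(reachability=opt.reachability_,
                               core_distances=opt.core_distances_,
                               ordering=opt.ordering_, eps=eps)
print(np.unique(labels))                     # -1 marks recognized noise
```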


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Cheng Lu ◽  
Shiji Song ◽  
Cheng Wu

The Affinity Propagation (AP) algorithm is an effective clustering algorithm, but it is not directly applicable to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on k-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates a KNNI representation of the missing attributes from the attribute distribution of the available data. The similarity function is then adapted to handle interval data, making the improved AP algorithm applicable to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
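The sketch below shows one plausible reading of the KNNI idea; the interval construction and the interval similarity are assumptions for illustration, not the paper's Improved Partial Data Strategy. Each missing attribute becomes a [lo, hi] interval over the k nearest complete records, and Affinity Propagation runs on a precomputed interval similarity.

```python
# Sketch of the k-nearest-neighbor-interval idea: represent each missing
# attribute as a [lo, hi] interval over the k nearest complete records
# (distance measured on observed attributes only), then run Affinity
# Propagation on a precomputed interval similarity. A simplified
# stand-in for the paper's Improved Partial Data Strategy.
import numpy as np
from sklearn.cluster import AffinityPropagation

def knn_intervals(X, k=5):
    """Per-cell [lo, hi] intervals; observed cells get lo == hi."""
    complete = X[~np.isnan(X).any(axis=1)]
    lo, hi = X.copy(), X.copy()
    for i, row in enumerate(X):
        miss = np.isnan(row)
        if not miss.any():
            continue
        obs = ~miss
        d = np.linalg.norm(complete[:, obs] - row[obs], axis=1)
        nbrs = complete[np.argsort(d)[:k]]
        lo[i, miss] = nbrs[:, miss].min(axis=0)
        hi[i, miss] = nbrs[:, miss].max(axis=0)
    return lo, hi

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X[rng.random(X.shape) < 0.1] = np.nan        # inject 10% missing cells

lo, hi = knn_intervals(X)
# Interval similarity: negative squared distance between midpoints plus
# a penalty on width mismatch (one simple choice among many).
mid, width = (lo + hi) / 2, hi - lo
S = -(np.square(mid[:, None] - mid[None]).sum(-1)
      + np.square(width[:, None] - width[None]).sum(-1))

ap = AffinityPropagation(affinity="precomputed", random_state=0).fit(S)
print(len(ap.cluster_centers_indices_), "clusters")
```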

