Spectral clustering algorithm based on K-nearest neighbor measure

Unsupervised spectral clustering methods can perform well when identifying crisp, low-complexity clusters, since the learning algorithm does not rely on finding local minima of an objective function and instead uses spectral properties of the graph. Nonetheless, the performance of such approaches is usually affected by their uncertain parameters. Building on the underlying structure of a general spectral clustering method, this paper introduces a new soft-link spectral clustering algorithm that identifies clusters using a fuzzy k-nearest neighbor approach. We construct a soft weight matrix of a graph by identifying the upper and lower bounds of the learning parameters of the similarity function, specifically the fuzzifier parameter (fuzziness) of the Fuzzy k-Nearest Neighbor algorithm. The algorithm allows perturbations of the graph Laplacian during the learning stage through changes in these learning parameters. Through empirical analysis on an artificial dataset and a real textual entailment dataset, we demonstrate that our initial hypothesis holds: implementing soft links for spectral clustering can improve the classification performance of the final outcome.
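
As a rough illustration of what a soft weight matrix bounded by a fuzzifier range could look like, the sketch below uses the standard Fuzzy k-NN inverse-distance weighting, where the weight of neighbor j for point i is proportional to 1 / d_ij^(2/(m-1)), and evaluates it at a lower and an upper fuzzifier value. The function name `fuzzy_knn_weights`, the toy points, and the bounds m = 1.5 and m = 3.0 are assumptions made for illustration; the abstract does not specify how the paper actually builds its matrix.

```python
import math

def fuzzy_knn_weights(points, k, m):
    """Row-normalized Fuzzy k-NN weights for fuzzifier m > 1 (illustrative)."""
    n = len(points)
    exponent = 2.0 / (m - 1.0)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        nearest = dists[:k]
        # Inverse-distance terms; assumes no duplicate points (d > 0).
        raw = [(1.0 / d ** exponent, j) for d, j in nearest]
        total = sum(r for r, _ in raw)
        for r, j in raw:
            weights[i][j] = r / total
    return weights

# Hypothetical toy data and fuzzifier bounds.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
w_lower = fuzzy_knn_weights(points, k=2, m=1.5)  # sharper weighting
w_upper = fuzzy_knn_weights(points, k=2, m=3.0)  # flatter weighting
```

A smaller fuzzifier concentrates weight on the closest neighbor, while a larger one spreads it more evenly, so the two matrices bracket the range within which each link weight can vary.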

DRSA: a non-hierarchical clustering algorithm using k-NN graph and its application in vegetation classification

Vegetation of Russia ◽ 10.31111/vegrus/2015.27.125 ◽ 2015 ◽ pp. 125-138 ◽ Cited By ~ 2

Author(s): I. V. Goncharenko

Keyword(s): Cluster Analysis ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Clustering Algorithms ◽ Protein Structures ◽ Hierarchical Cluster ◽ Vegetation Classification ◽ K Nearest Neighbor ◽ Neighbor Graph ◽ Nearest Neighbor Graph

In this article we propose a new method of non-hierarchical cluster analysis using a k-nearest-neighbor graph and discuss it with respect to vegetation classification. The method of k-nearest neighbor (k-NN) classification was originally developed in 1951 (Fix, Hodges, 1951). Later the term “k-NN graph” and a few algorithms of k-NN clustering appeared (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an “excessive” graph, a so-called hypergraph, and then truncate it into subgraphs by partitioning and coarsening the hypergraph. We developed a different strategy, “upward” clustering, which forms (assembles consecutively) one cluster after another. Until now, graph-based cluster analysis has not been considered for the classification of vegetation datasets.
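
For readers unfamiliar with the k-NN graph that the method builds on, here is a minimal sketch of constructing one from a distance function. It does not reproduce the DRSA algorithm itself; `knn_graph` and the toy one-dimensional data are hypothetical names and values chosen for illustration.

```python
def knn_graph(points, k, dist):
    """Return {i: [indices of the k nearest neighbors of point i]}."""
    graph = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by distance, breaking ties by index for reproducibility.
        others.sort(key=lambda j: (dist(p, points[j]), j))
        graph[i] = others[:k]
    return graph

# Toy data: two groups on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
graph = knn_graph(points, k=2, dist=lambda a, b: abs(a - b))
```

The neighbor relation is not symmetric: in this toy data point 3 lists point 2 as a neighbor, but point 2 does not list point 3, which is why k-NN graphs are usually treated as directed or symmetrized explicitly.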

An Enhanced Spectral Clustering Algorithm with S-Distance

Symmetry ◽ 10.3390/sym13040596 ◽ 2021 ◽ Vol 13 (4) ◽ pp. 596

Author(s): Krishna Kumar Sharma ◽ Ayan Seal ◽ Enrique Herrera-Viedma ◽ Ondrej Krejcar

Keyword(s): Spectral Clustering ◽ Clustering Algorithm ◽ Spatial Clustering ◽ Clustering Algorithms ◽ Rank Test ◽ Customer Churn ◽ Signed Rank ◽ Signed Rank Test ◽ Spectral Clustering Algorithm ◽ Industrial Databases

Calculating and monitoring customer churn metrics is important for companies to retain customers and earn more profit. In this study, a churn prediction framework is developed using a modified spectral clustering (SC) algorithm. The similarity measure plays a crucial role in clustering, and therefore in predicting churn accurately from industrial data. The linear Euclidean distance in traditional SC is replaced by the non-linear S-distance (Sd), which is derived from the concept of S-divergence (SD). Several characteristics of Sd are discussed in this work. Experiments are conducted to validate the proposed clustering algorithm on four synthetic databases, eight UCI databases, two industrial databases, and one telecommunications database related to customer churn. Three existing clustering algorithms (k-means, density-based spatial clustering of applications with noise, and conventional SC) are also run on the same 15 databases. The empirical results show that the proposed clustering algorithm outperforms the three existing algorithms in terms of Jaccard index, F-score, recall, precision, and accuracy. Finally, the significance of the clustering results is tested with Wilcoxon’s signed-rank test, Wilcoxon’s rank-sum test, and the sign test. The comparative study shows that the outcomes of the proposed algorithm are interesting, especially in the case of clusters of arbitrary shape.
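
The abstract does not give the formula for the S-distance, so the following is only a sketch under a stated assumption. For diagonal positive definite matrices, the S-divergence log det((X+Y)/2) - (1/2) log det(XY) reduces to a coordinate-wise sum over strictly positive values; the code assumes the S-distance is the square root of that sum. The names `s_distance` and `s_affinity`, and the Gaussian-style affinity with parameter `sigma`, are hypothetical.

```python
import math

def s_distance(x, y):
    """Assumed coordinate-wise S-distance for strictly positive vectors."""
    total = 0.0
    for a, b in zip(x, y):
        # log of the arithmetic mean minus the mean of the logs (>= 0 by AM-GM).
        total += math.log((a + b) / 2.0) - (math.log(a) + math.log(b)) / 2.0
    # Clamp tiny negative rounding errors before taking the square root.
    return math.sqrt(max(total, 0.0))

def s_affinity(x, y, sigma=1.0):
    """Gaussian-style affinity using the S-distance instead of Euclidean (assumed form)."""
    return math.exp(-s_distance(x, y) ** 2 / (2.0 * sigma ** 2))
```

Under this assumption the measure depends on ratios rather than differences: `s_distance([1.0], [4.0])` and `s_distance([10.0], [40.0])` coincide, which is one way it behaves differently from the Euclidean distance.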

The Spectrum Segmentation Algorithm of Multimode Vibration Signal Based on Spectral Clustering

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.121-126.2372 ◽ 2011 ◽ Vol 121-126 ◽ pp. 2372-2376

Author(s): Dan Dan Wang ◽ Yu Zhou ◽ Qing Wei Ye ◽ Xiao Dong Wang

Keyword(s): Wave Packet ◽ Frequency Domain ◽ Spectral Clustering ◽ Clustering Algorithm ◽ Wave Packets ◽ Vibration Signal ◽ Segmentation Algorithm ◽ Frequency Curve ◽ Macroscopic Observation ◽ Spectral Clustering Algorithm

The mode peaks in the frequency domain of a vibration signal are strongly affected by heavy noise, which makes the estimated mode parameters inaccurate. To address this, the paper proposes mode-peak segmentation based on the spectral clustering algorithm. First, following the concept of a wave packet, the amplitude-frequency curve of the vibration signal is divided into wave packets. Taking each wave packet as a sample, the spectral clustering algorithm is used to classify these wave packets. The amplitude-frequency curve of a mode peak then appears as one large wave packet at the macroscopic level. Experiments on simulated signals indicate that this spectral clustering algorithm agrees well with the macroscopic observation of mode segmentation and performs especially well under strong noise.

Research on Spectral Clustering

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.687-691.1350 ◽ 2014 ◽ Vol 687-691 ◽ pp. 1350-1353

Author(s): Li Li Fu ◽ Yong Li Liu ◽ Li Jing Hao

Keyword(s): Spectral Clustering ◽ Clustering Algorithm ◽ Theoretical Foundation ◽ Clustering Algorithms ◽ Spectral Graph Theory ◽ Graph Partition ◽ Mining Areas ◽ Spectral Graph ◽ Definition Of ◽ Spectral Clustering Algorithm

The spectral clustering algorithm is a clustering algorithm based on spectral graph theory. Because spectral clustering has a deep theoretical foundation and handles non-convex distributions well, it has received much attention in the machine learning and data mining communities. The algorithm is easy to implement and often outperforms traditional clustering algorithms such as K-means. This paper aims to give some intuition about spectral clustering. We describe different graph partition criteria, the definition of spectral clustering, and the clustering steps. Finally, some improvements that address the disadvantages of spectral clustering are introduced briefly.
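
The clustering steps mentioned above can be sketched for the simplest case of two clusters. This is a minimal, dependency-free illustration rather than the paper's formulation: it assumes one-dimensional points, a fully connected Gaussian similarity graph, the unnormalized Laplacian L = D - W, and a plain power iteration in place of a library eigensolver. The function name `spectral_bipartition` and the parameters `sigma` and `iters` are hypothetical.

```python
import math

def spectral_bipartition(points, sigma=1.0, iters=500):
    """Split 1-D points in two via the Fiedler vector of L = D - W."""
    n = len(points)
    # Step 1: fully connected Gaussian similarity graph (no self-loops).
    w = [[0.0 if i == j else
          math.exp(-(points[i] - points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    # Step 2: power iteration on B = c*I - L. With the constant vector
    # projected out, the dominant eigenvector of B is the Fiedler vector.
    c = 2.0 * max(deg)  # Gershgorin bound on the largest eigenvalue of L
    # Assumed start vector; it must not be orthogonal to the Fiedler vector.
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant component
        lv = [deg[i] * v[i] - sum(w[i][j] * v[j] for j in range(n))
              for i in range(n)]
        v = [c * v[i] - lv[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # Step 3: the sign of each Fiedler entry gives the cluster label.
    return [1 if x > 0 else 0 for x in v]

labels = spectral_bipartition([0.0, 0.5, 1.0, 3.0, 3.5, 4.0])
```

For more than two clusters, the usual approach is to take several of the smallest eigenvectors and run K-means on the resulting embedding; the sign split here is only the two-cluster special case.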

CAST: A Correlation-based Adaptive Spectral Clustering Algorithm on Multi-scale Data

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining ◽ 10.1145/3394486.3403086 ◽ 2020

Author(s): Xiang Li ◽ Ben Kao ◽ Caihua Shan ◽ Dawei Yin ◽ Martin Ester

Keyword(s): Spectral Clustering ◽ Clustering Algorithm ◽ Multi Scale ◽ Spectral Clustering Algorithm ◽ Scale Data

An improved OPTICS clustering algorithm for discovering clusters with uneven densities

Intelligent Data Analysis ◽ 10.3233/ida-205497 ◽ 2021 ◽ Vol 25 (6) ◽ pp. 1453-1471

Author(s): Chunhua Tang ◽ Han Wang ◽ Zhiwen Wang ◽ Xiangkun Zeng ◽ Huaran Yan ◽ ...

Keyword(s): Time Complexity ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Clustering Algorithms ◽ Substantial Improvement ◽ Experimental Results ◽ High Time ◽ Parameter Setting ◽ K Nearest Neighbor ◽ Density Based Clustering

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering on datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) in the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of the corresponding cluster. This overcomes the weakness of most algorithms when clustering datasets with uneven densities. By computing the distance to the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity and outperforms other algorithms in parameter setting and noise recognition.
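
The two quantities the abstract relies on, the k-nearest-neighbor distance and the reachability-distance, come from standard OPTICS and can be sketched briefly. This is not an implementation of FOP-OPTICS: the demarcation-point search is not shown, the eps cutoff is omitted, and counting a point as its own first neighbor is a convention assumed here. The function names and toy points are hypothetical.

```python
import math

def core_distance(points, p, min_pts):
    """Distance from points[p] to its min_pts-th nearest neighbor.

    The point itself counts as its own first neighbor (assumed
    convention); the eps cutoff of full OPTICS is omitted.
    """
    dists = sorted(math.dist(points[p], q) for q in points)
    return dists[min_pts - 1]

def reachability_distance(points, o, p, min_pts):
    """Reachability of points[o] from points[p]: max(core-dist(p), d(p, o))."""
    return max(core_distance(points, p, min_pts),
               math.dist(points[p], points[o]))

# Toy data: a tight group of three points and one distant point.
points = [(0, 0), (0, 1), (1, 0), (5, 5)]
dense_core = core_distance(points, 0, min_pts=3)
sparse_core = core_distance(points, 3, min_pts=3)
```

Points in dense regions get small core distances and isolated points get large ones, which is the signal a cluster ordering exposes and a per-cluster radius can be read from.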

An Improved Spectral Clustering Algorithm Based on Dynamic Tissue-Like Membrane System

Lecture Notes in Computer Science - Intelligence Science and Big Data Engineering ◽ 10.1007/978-3-030-02698-1_38 ◽ 2018 ◽ pp. 433-442

Author(s): Xuewei Hu ◽ Xiyu Liu

Keyword(s): Spectral Clustering ◽ Clustering Algorithm ◽ Membrane System ◽ Spectral Clustering Algorithm

Construction of Protein Backbone Fragments Libraries on Large Protein Sets Using a Randomized Spectral Clustering Algorithm

Bioinformatics Research and Applications - Lecture Notes in Computer Science ◽ 10.1007/978-3-319-59575-7_10 ◽ 2017 ◽ pp. 108-119 ◽ Cited By ~ 2

Author(s): Wessam Elhefnawy ◽ Min Li ◽ Jianxin Wang ◽ Yaohang Li

Keyword(s): Spectral Clustering ◽ Clustering Algorithm ◽ Protein Backbone ◽ Large Protein ◽ Spectral Clustering Algorithm

K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data

Mathematical Problems in Engineering ◽ 10.1155/2015/535932 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-9 ◽ Cited By ~ 2

Author(s): Cheng Lu ◽ Shiji Song ◽ Cheng Wu

Keyword(s): Clustering Analysis ◽ Incomplete Data ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Interval Data ◽ Similarity Function ◽ K Nearest Neighbor ◽ Partial Data ◽ Missing Attributes ◽ Ap Clustering

The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it cannot be applied directly to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is adapted to handle the interval data, so that the improved AP algorithm can be applied to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
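
The abstract does not spell out how the intervals are computed, so the sketch below makes two assumptions: neighbors are ranked with the classic partial distance (squared differences over attributes present in both records, rescaled by the ratio of total to shared attributes), and the interval for a missing attribute spans the minimum and maximum of that attribute over the k nearest records that have it. It does not reproduce the paper's Improved Partial Data Strategy or the modified AP step; `partial_distance`, `knn_interval`, and the toy records are hypothetical.

```python
import math

def partial_distance(x, y):
    """Distance over attributes present in both records, rescaled to full width."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return math.inf  # no attribute in common, treat as infinitely far
    sq = sum((a - b) ** 2 for a, b in shared)
    return math.sqrt(sq * len(x) / len(shared))

def knn_interval(data, row, attr, k):
    """Interval [min, max] of `attr` over the k nearest records that have it."""
    donors = [r for i, r in enumerate(data) if i != row and r[attr] is not None]
    donors.sort(key=lambda r: partial_distance(data[row], r))
    values = [r[attr] for r in donors[:k]]
    return (min(values), max(values))

data = [
    (1.0, 2.0, None),   # record with a missing third attribute
    (1.1, 2.1, 3.0),
    (0.9, 1.8, 3.4),
    (8.0, 9.0, 10.0),
]
interval = knn_interval(data, row=0, attr=2, k=2)
```

Representing the missing value as an interval rather than a single imputed number keeps the uncertainty visible to the similarity function instead of hiding it behind a point estimate.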
