Approximate spectral clustering using both reference vectors and topology of the network generated by growing neural gas

2021 ◽  
Vol 7 ◽  
pp. e679
Author(s):  
Kazuhisa Fujita

Spectral clustering (SC) is one of the most popular clustering methods and often outperforms traditional clustering methods. SC uses the eigenvectors of a Laplacian matrix calculated from the similarity matrix of a dataset. However, SC has serious drawbacks: a significant increase in time complexity from computing the eigenvectors and in space complexity from storing the similarity matrix. To address these issues, I develop a new approximate spectral clustering method based on the network generated by growing neural gas (GNG), called ASC with GNG in this study. ASC with GNG uses not only the reference vectors obtained by vector quantization but also the topology of the network to extract the topological relationships between data points in a dataset. ASC with GNG calculates the similarity matrix from both the reference vectors and the topology of the network generated by GNG. By using the network generated from a dataset by GNG, ASC with GNG reduces the computational and space complexities and improves clustering quality. In this study, I demonstrate that ASC with GNG effectively reduces computational time. Moreover, this study shows that ASC with GNG provides clustering performance equal to or better than that of SC.
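A minimal sketch of the general idea, not the authors' implementation: since GNG has no standard scikit-learn implementation, k-means quantization stands in for the GNG codebook and a k-nearest-neighbor graph over the reference vectors stands in for the learned topology. The small similarity matrix over reference vectors is clustered spectrally, and each original point inherits the label of its nearest reference vector. All parameter names (`n_refs`, `n_topo_neighbors`) are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.neighbors import kneighbors_graph, NearestNeighbors

def approx_spectral_clustering(X, n_clusters, n_refs=100, n_topo_neighbors=3):
    """Hedged sketch of ASC-style clustering: quantize, build a small
    similarity matrix over reference vectors, cluster it spectrally,
    and propagate labels back to the original points."""
    # Stand-in for GNG reference vectors: a k-means codebook.
    km = KMeans(n_clusters=n_refs, n_init=4, random_state=0).fit(X)
    refs = km.cluster_centers_

    # Stand-in for the GNG topology: connect each reference vector
    # to its nearest neighbors within the codebook.
    topo = kneighbors_graph(refs, n_neighbors=n_topo_neighbors,
                            mode="connectivity", include_self=False)
    topo = ((topo + topo.T) > 0).astype(float).toarray()

    # Similarity = Gaussian kernel on reference vectors, masked by topology.
    d2 = ((refs[:, None, :] - refs[None, :, :]) ** 2).sum(-1)
    sigma2 = np.median(d2[d2 > 0])
    S = np.exp(-d2 / sigma2) * topo

    # Spectral clustering on the small n_refs x n_refs similarity matrix.
    ref_labels = SpectralClustering(n_clusters=n_clusters,
                                    affinity="precomputed",
                                    random_state=0).fit_predict(S)

    # Each data point inherits the label of its nearest reference vector.
    nn = NearestNeighbors(n_neighbors=1).fit(refs)
    idx = nn.kneighbors(X, return_distance=False).ravel()
    return ref_labels[idx]
```

Because the eigendecomposition runs on the small reference-vector matrix rather than the full dataset, both time and memory drop roughly from O(n^2)-O(n^3) to costs governed by the number of reference vectors.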


2013 ◽  
Vol 765-767 ◽  
pp. 580-584
Author(s):  
Yu Yang ◽  
Cheng Gui Zhao

Spectral clustering algorithms inevitably face computational time and memory problems on large-scale data, because they are both compute-intensive and data-intensive. We analyse the time complexity of constructing the similarity matrix, performing the eigendecomposition, and running k-means, and we exploit the SPMD parallel structure supported by the MATLAB Parallel Computing Toolbox (PCT) to decrease the eigendecomposition time. We propose using the MATLAB Distributed Computing Server to construct the similarity matrix in parallel, while using a t-nearest-neighbors approach to reduce memory use. Finally, we report clustering time, clustering quality, and clustering accuracy in the experiments.
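The memory-saving step described here, keeping only the t nearest neighbors of each point instead of the full n x n similarity matrix, can be illustrated with scikit-learn's sparse neighbor graph; the MATLAB SPMD parallelism itself is not reproduced in this sketch, and the Gaussian bandwidth `sigma` is an assumed parameter.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import kneighbors_graph

def sparse_tnn_affinity(X, t=10, sigma=1.0):
    """Sparse affinity keeping only each point's t nearest neighbors,
    so memory grows as O(n*t) instead of O(n^2)."""
    # Distances to the t nearest neighbors, stored as a sparse matrix.
    D = kneighbors_graph(X, n_neighbors=t, mode="distance", include_self=False)
    # Gaussian similarity on the stored (nonzero) distances only.
    W = D.copy()
    W.data = np.exp(-(D.data ** 2) / (2.0 * sigma ** 2))
    # Symmetrize so the resulting graph is undirected.
    W = W.maximum(W.T)
    return csr_matrix(W)
```

The resulting sparse matrix can be passed directly to a spectral clustering routine that accepts a precomputed affinity, and sparse eigensolvers then operate on it without ever materializing the dense similarity matrix.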



Author(s):  
Xiang Li ◽  
Ben Kao ◽  
Zhaochun Ren ◽  
Dawei Yin

A heterogeneous information network (HIN) is one whose objects are of different types and whose links can model different relations between objects. We study how spectral clustering can be effectively applied to HINs. In particular, we focus on how meta-path relations are used to construct an effective similarity matrix on which spectral clustering is performed. We formulate the similarity matrix construction as an optimization problem and propose the SClump algorithm for solving it. We conduct extensive experiments comparing SClump with other state-of-the-art clustering algorithms on HINs. Our results show that SClump outperforms the competitors over a range of datasets with respect to different clustering quality measures.
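A simplified sketch of the pipeline, under stated assumptions: the meta-path similarity matrices (e.g., PathSim matrices) are taken as precomputed inputs and are combined with fixed uniform weights before spectral clustering. SClump itself learns the meta-path weights and the similarity matrix jointly, which is not reproduced here.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def fused_metapath_clustering(metapath_sims, n_clusters, weights=None):
    """Cluster HIN objects from a weighted combination of meta-path
    similarity matrices (uniform weights stand in for SClump's learned ones)."""
    k = len(metapath_sims)
    if weights is None:
        weights = np.full(k, 1.0 / k)
    # Weighted fusion of the per-meta-path similarity matrices.
    S = sum(w * M for w, M in zip(weights, metapath_sims))
    S = (S + S.T) / 2.0            # enforce symmetry
    np.fill_diagonal(S, 0.0)       # no self-similarity
    return SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                              random_state=0).fit_predict(S)
```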



2019 ◽  
Vol 2019 ◽  
pp. 1-7
Author(s):  
Libo Yang ◽  
Xuemei Liu ◽  
Feiping Nie ◽  
Mingtang Liu

Spectral clustering (SC) has attracted more and more attention due to its effectiveness in machine learning. However, most traditional spectral clustering methods still struggle with large-scale problems, mainly because of their high computational complexity of O(n^3), where n is the number of samples. In order to achieve fast spectral clustering, we propose a novel approach, called representative point-based spectral clustering (RPSC), to efficiently deal with the large-scale spectral clustering problem. The proposed method first generates two layers of representative points successively by BKHK (balanced k-means-based hierarchical k-means). It then constructs a hierarchical bipartite graph and performs spectral analysis on the graph. Specifically, we construct the similarity matrix using a parameter-free neighbor assignment method, which avoids the need to tune extra parameters. Furthermore, we perform coclustering on the final similarity matrix. The coclustering mechanism exploits the co-occurring cluster structure between the representative points and the original data to strengthen the clustering performance. As a result, the computational complexity can be significantly reduced and the clustering accuracy improved. Extensive experiments on several large-scale data sets show the effectiveness, efficiency, and stability of the proposed method.
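A hedged, single-layer sketch of the representative-point idea: plain k-means anchors stand in for the BKHK hierarchy, a parameter-free neighbor assignment (the standard closed-form weighting from the k+1 nearest anchors) builds the sparse bipartite graph, and an SVD of the normalized graph replaces the full hierarchical coclustering step. The parameters `n_anchors` and `k` are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

def anchor_spectral_clustering(X, n_clusters, n_anchors=200, k=5):
    """Simplified anchor-based spectral clustering on a bipartite graph."""
    # 1) Representative (anchor) points via k-means.
    anchors = KMeans(n_clusters=n_anchors, n_init=4,
                     random_state=0).fit(X).cluster_centers_

    # 2) Parameter-free assignment: weights from the k+1 nearest anchors.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(anchors)
    dist, idx = nn.kneighbors(X)                       # shape (n, k+1)
    dk1 = dist[:, k][:, None]                          # (k+1)-th distance
    num = dk1 - dist[:, :k]
    den = k * dk1 - dist[:, :k].sum(1, keepdims=True) + 1e-12
    w = num / den                                      # each row sums to ~1

    n = X.shape[0]
    Z = np.zeros((n, n_anchors))
    np.put_along_axis(Z, idx[:, :k], w, axis=1)        # sparse bipartite graph

    # 3) Spectral analysis of the bipartite graph: column-normalize and
    #    take the top singular vectors as the embedding of the data points.
    col = Z.sum(0) + 1e-12
    Z_hat = Z / np.sqrt(col)
    U, _, _ = np.linalg.svd(Z_hat, full_matrices=False)
    emb = U[:, :n_clusters]
    emb /= np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12

    # 4) Final k-means on the spectral embedding.
    return KMeans(n_clusters=n_clusters, n_init=10,
                  random_state=0).fit_predict(emb)
```

The key saving is that the eigendecomposition (here an SVD) runs on an n x m matrix with m representative points rather than on the full n x n similarity matrix, avoiding the O(n^3) cost.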



2021 ◽  
Vol 13 (5) ◽  
pp. 955
Author(s):  
Shukun Zhang ◽  
James M. Murphy

We propose a method for the unsupervised clustering of hyperspectral images (HSI) based on spatially regularized spectral clustering with ultrametric path distances. The proposed method efficiently combines data density and spectral-spatial geometry to distinguish between material classes in the data, without the need for training labels. The proposed method is efficient, with quasilinear scaling in the number of data points, and enjoys robust theoretical performance guarantees. Extensive experiments on synthetic and real HSI data demonstrate its strong performance compared to benchmark and state-of-the-art methods. Indeed, the proposed method not only achieves excellent labeling accuracy, but also efficiently estimates the number of clusters. Thus, unlike almost all existing hyperspectral clustering methods, the proposed algorithm is essentially parameter-free.
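A minimal sketch of the core ingredient only, omitting the spatial regularization, density weighting, and quasilinear implementation of the full method: under the common longest-leg formulation, the ultrametric path distance between two points coincides with their cophenetic distance under single-linkage clustering, and spectral clustering can then be run on a kernel built from those distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform
from sklearn.cluster import SpectralClustering

def ultrametric_spectral_clustering(X, n_clusters):
    """Spectral clustering on ultrametric (longest-leg) path distances,
    simplified: no spatial regularization, no density term."""
    # Single-linkage cophenetic distances equal the longest-leg
    # path distances between points (an ultrametric).
    Z = linkage(X, method="single")
    U = squareform(cophenet(Z))               # n x n ultrametric distances

    # Gaussian kernel on the ultrametric distances as the affinity.
    sigma = np.median(U[U > 0])
    A = np.exp(-(U ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)

    return SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                              random_state=0).fit_predict(A)
```

Note that this naive sketch materializes the full n x n distance matrix; the quasilinear scaling claimed by the paper requires the authors' specialized construction, which is not reproduced here.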



Author(s):  
Wenke Zang ◽  
Zhenni Jiang ◽  
Liyan Ren

Spectral clustering has become very popular in recent years, due to the simplicity of its implementation as well as its performance in comparison with other popular methods. However, many studies show that clustering results are sensitive to the choice of the similarity graph and its parameters. To address this issue, inspired by density-sensitive similarity measures, we propose an improved spectral graph clustering method that combines a data-density-based similarity measure with DNA genetic algorithms (ISC-DNA-GA). The measure increases the distance between pairs of data points that lie in different high-density regions and reduces the similarity between pairs of data points in the same density region, so as to capture the spatial distribution characteristics of complex data. After computing the Laplacian matrix, we apply DNA-GAs to obtain the clustering centroids and assign all points to the centroids, so as to achieve better clustering results. Experiments have been conducted on artificial and real-world datasets of various dimensions, using evaluation methods based on external clustering criteria. The results show that the proposed method improves spectral clustering quality and is superior to the competing approaches.
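A sketch of one common formulation of a density-sensitive similarity, under stated assumptions: edge lengths are stretched as rho**d - 1 on a neighbor graph, geodesic (shortest-path) distances are taken, and similarity is 1/(1 + distance). The paper's exact measure and its DNA genetic algorithm for centroid search are not reproduced; the commented usage lets scikit-learn's standard k-means on the spectral embedding stand in for the DNA-GA step, and `rho`, `n_neighbors`, and `X` are placeholders.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import SpectralClustering

def density_sensitive_affinity(X, rho=2.0, n_neighbors=10):
    """One common density-sensitive similarity: stretch each edge by
    rho**d - 1, take shortest paths, convert distances to similarities."""
    D = kneighbors_graph(X, n_neighbors=n_neighbors, mode="distance")
    D = D.maximum(D.T)                           # undirected neighbor graph
    D.data = rho ** D.data - 1.0                 # density-adjusted edge length
    G = shortest_path(D, method="D", directed=False)   # geodesic distances
    G[np.isinf(G)] = G[np.isfinite(G)].max() * 10      # bridge disconnected parts
    return 1.0 / (1.0 + G)                       # density-sensitive similarity

# Hypothetical usage (X is a placeholder data matrix):
# labels = SpectralClustering(n_clusters=3, affinity="precomputed",
#                             random_state=0).fit_predict(
#                                 density_sensitive_affinity(X))
```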



2013 ◽  
Vol 433-435 ◽  
pp. 725-730
Author(s):  
Sheng Zhang ◽  
Xiao Qi He ◽  
Yang Guang Liu ◽  
Qi Chun Huang

Constructing the similarity matrix is the key step in spectral clustering, and its goal is to model the local neighborhood relationships between data points. In order to evaluate the influence of the similarity matrix on the performance of different spectral clustering algorithms and to find rules for constructing an appropriate similarity matrix, a systematic empirical study was carried out. In the study, six recently proposed spectral clustering algorithms were selected as evaluation objects, and normalized mutual information, F-measure, and the Rand index were used as evaluation metrics. Experiments were then carried out on eight synthetic datasets and eleven real-world datasets. The experimental results show that using multiple metrics makes the results more comprehensive and reliable, and that the overall performance of the locality spectral clustering algorithm is better than that of the other five algorithms on both synthetic and real-world datasets.
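The evaluation protocol described above maps directly onto standard scikit-learn metrics; a minimal sketch follows, with the F-measure computed on point pairs (one common convention for clustering F-measure, assumed here since the paper does not specify its variant).

```python
from sklearn.metrics import normalized_mutual_info_score, rand_score
from sklearn.metrics.cluster import pair_confusion_matrix

def clustering_scores(y_true, y_pred):
    """NMI, Rand index, and a pairwise F-measure for a clustering result."""
    # Pair confusion matrix: agreements/disagreements over all point pairs.
    (tn, fp), (fn, tp) = pair_confusion_matrix(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "NMI": normalized_mutual_info_score(y_true, y_pred),
        "Rand": rand_score(y_true, y_pred),
        "F-measure": f_measure,
    }
```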



Recent attention in the clustering research field has focused on grouping clusters based on the structure of a graph. Although a plentiful body of literature has been proposed on clustering techniques, finding the best technique for clustering remains an open challenge. This paper presents a comprehensive review of our insights into emerging graph-based spectral clustering methods. Graph Laplacians have become a core technology for spectral clustering, which works based on the properties of the Laplacian matrix. In our study, we discuss the relationship between the similarity and Laplacian matrices of a graph, together with the relevant concepts of spectral graph theory. Current graph-based clustering methods require a well-defined, good-quality graph to achieve high clustering accuracy. This paper describes how spectral graph theory has been used in the clustering literature and how it helps to predict relationships that have not yet been identified in existing work. Some application areas of graph clustering algorithms are discussed. This survey outlines the problems addressed by existing research on spectral clustering, together with its methodologies, data sets, and advantages. It also identifies fundamental issues of graph clustering, providing a better direction for further applications in social network analysis, image segmentation, computer vision, and other domains.
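As a concrete reference for the relationship between the similarity matrix and the graph Laplacians discussed in this survey, a minimal sketch of the two standard constructions (unnormalized and symmetric normalized) is given below; it is a generic illustration, not code from any of the surveyed works.

```python
import numpy as np

def graph_laplacians(W):
    """Unnormalized and symmetric normalized Laplacians of a similarity
    matrix W (assumed symmetric, non-negative, zero diagonal)."""
    d = W.sum(axis=1)                                  # node degrees
    L = np.diag(d) - W                                 # L = D - W
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return L, L_sym                                    # L_sym = I - D^{-1/2} W D^{-1/2}

# The eigenvectors associated with the smallest eigenvalues of L_sym
# form the spectral embedding on which k-means is typically run:
# vals, vecs = np.linalg.eigh(L_sym)
```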



Author(s):  
Yun Xiao ◽  
Pengzhen Ren ◽  
Zhihui Li ◽  
Xiaojiang Chen ◽  
Xin Wang ◽  
...  

Spectral clustering has been widely adopted because it can mine structures between data clusters. The clustering performance of spectral clustering depends largely on the quality of the constructed affinity graph, especially when the data contain noise. Subspace learning can transform the original input features into a low-dimensional subspace and help to produce a robust method. Therefore, how to learn an intrinsic subspace and construct a pure affinity graph on a noisy dataset is a challenge in spectral clustering. To deal with this challenge, a new Robust Single-Step Spectral Clustering with Intrinsic Subspace (RS3CIS) method is proposed in this paper. RS3CIS uses a local representation method that projects the original data into a low-dimensional subspace through a row-sparse transformation matrix and uses the ℓ2,1-norm of the transformation matrix as a penalty term to achieve noise suppression. In addition, RS3CIS introduces a Laplacian matrix rank constraint so that it can output an affinity graph with an explicit clustering structure, which allows the final clustering result to be obtained in a single step of affinity matrix construction. One synthetic dataset and six real benchmark datasets are used to verify the performance of the proposed method through clustering and projection experiments. Experimental results show that RS3CIS outperforms related methods with respect to clustering quality, robustness, and dimension reduction.
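For reference, the ℓ2,1-norm penalty mentioned above sums the Euclidean norms of the matrix rows, which drives entire rows toward zero and thus yields the row sparsity used for noise suppression; a minimal illustration follows (the full RS3CIS optimization with the Laplacian rank constraint is not reproduced here).

```python
import numpy as np

def l21_norm(M):
    """l2,1-norm: sum of the Euclidean norms of the rows of M.
    Penalizing it encourages whole rows of M to shrink to zero,
    i.e. row sparsity of the transformation matrix."""
    return np.sqrt((M ** 2).sum(axis=1)).sum()

# Example: a row-sparse matrix has a smaller l2,1-norm than a dense
# matrix with the same Frobenius norm (both equal 2 here).
dense = np.full((4, 4), 0.5)
row_sparse = np.zeros((4, 4)); row_sparse[0] = 1.0
print(l21_norm(dense), l21_norm(row_sparse))   # 4.0 vs 2.0
```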



2014 ◽  
Vol 24 (3) ◽  
pp. 651-662
Author(s):  
Feng ZENG ◽  
Tong YANG ◽  
Shan YAO

