Spectral Clustering for Large-Scale Social Networks via a Pre-Coarsening Sampling based Nyström Method

Author(s):  
Ying Kang ◽  
Bo Yu ◽  
Weiping Wang ◽  
Dan Meng
2018 ◽  
Vol 129 ◽  
pp. 9-15 ◽  
Author(s):  
Liangchi Li ◽  
Shenling Wang ◽  
Shuaijing Xu ◽  
Yuqi Yang

2009 ◽  
Vol 21 (1) ◽  
pp. 121-146 ◽  
Author(s):  
Kai Zhang ◽  
James T. Kwok

The Nyström method is a well-known sampling-based technique for approximating the eigensystem of large kernel matrices. However, the chosen samples in the Nyström method are all assumed to be of equal importance, which deviates from the integral equation that defines the kernel eigenfunctions. Motivated by this observation, we extend the Nyström method to a more general, density-weighted version. We show that by introducing the probability density function as a natural weighting scheme, the approximation of the eigensystem can be greatly improved. An efficient algorithm is proposed to enforce such weighting in practice, which has the same complexity as the original Nyström method and hence is notably cheaper than several other alternatives. Experiments on kernel principal component analysis, spectral clustering, and image segmentation demonstrate the encouraging performance of our algorithm.
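For orientation, the standard (unweighted) Nyström approximation that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the authors' density-weighted algorithm: the RBF kernel, uniform sampling without replacement, and the names `rbf_kernel`, `nystrom_eigs`, `gamma`, `m`, and `k` are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_eigs(X, m, k, gamma=1.0, seed=0):
    """Approximate the top-k eigenpairs of the n x n kernel matrix from m samples.

    Assumption: samples are drawn uniformly, i.e. every landmark has equal
    importance, which is the setting the abstract says it generalizes.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)   # landmark indices
    C = rbf_kernel(X, X[idx], gamma)             # n x m block of the kernel matrix
    W = C[idx]                                   # m x m landmark kernel matrix
    evals, evecs = np.linalg.eigh(W)             # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]          # keep the k largest
    lam, U = evals[order], evecs[:, order]
    approx_vals = (n / m) * lam                  # rescaled eigenvalues
    approx_vecs = np.sqrt(m / n) * (C @ U) / lam # Nystrom extension to all n points
    return approx_vals, approx_vecs, idx
```

With these scalings, `approx_vecs @ np.diag(approx_vals) @ approx_vecs.T` equals the rank-`k` Nyström approximation `C @ pinv(W_k) @ C.T`. Only the `n x m` block `C` is ever formed, so the full kernel matrix is never needed.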


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Ling Wang ◽  
Hongqiao Wang ◽  
Guangyuan Fu

Extensions of kernel methods to the class imbalance problem have been extensively studied. Although they cope well with nonlinear problems, their high computation and memory costs severely limit their application to real-world imbalanced tasks. The Nyström method is an effective technique for scaling kernel methods. However, the standard Nyström method needs to sample a sufficiently large number of landmark points to ensure an accurate approximation, which seriously affects its efficiency. In this study, we propose a multi-Nyström method based on mixtures of Nyström approximations to avoid the explosion of the subkernel matrix, while the optimization of the mixture weights is embedded into the model training process by multiple kernel learning (MKL) algorithms to yield a more accurate low-rank approximation. Moreover, we select subsets of landmark points according to the imbalance distribution to reduce the model’s sensitivity to skewness. We also provide a kernel stability analysis of our method and show that the model solution error is bounded by weighted approximation errors, which can help us improve the learning process. Extensive experiments on several large-scale datasets show that our method can achieve a higher classification accuracy and a dramatic speedup of MKL algorithms.
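The idea of mixing several small Nyström approximations rather than building one large one can be sketched as below. This is a hedged illustration only: the abstract learns the mixture weights with MKL and picks landmarks according to the class imbalance, whereas this sketch substitutes fixed uniform weights and uniformly sampled, disjoint landmark subsets. The RBF kernel and the names `nystrom_approx`, `mixture_nystrom`, `n_experts`, and `m` are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_approx(X, idx, gamma=1.0):
    """Single Nystrom approximation C W^+ C^T built from the landmarks X[idx]."""
    C = rbf_kernel(X, X[idx], gamma)   # n x m
    W = C[idx]                         # m x m
    return C @ np.linalg.pinv(W) @ C.T

def mixture_nystrom(X, n_experts, m, weights=None, gamma=1.0, seed=0):
    """Weighted sum of n_experts Nystrom approximations, each with m landmarks.

    Assumption: uniform weights stand in for the MKL-learned weights, and
    landmarks are sampled uniformly rather than by class distribution.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    if n_experts * m > n:
        raise ValueError("need n_experts * m <= number of samples")
    # Disjoint landmark subsets: one row of indices per approximation
    subsets = rng.permutation(n)[: n_experts * m].reshape(n_experts, m)
    if weights is None:
        weights = np.full(n_experts, 1.0 / n_experts)
    return sum(w * nystrom_approx(X, idx, gamma) for w, idx in zip(weights, subsets))
```

Each component only requires the pseudo-inverse of an `m x m` matrix, so the cost of the inversions grows linearly in the number of components instead of cubically in the total number of landmarks.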


Author(s):  
Anna Choromanska ◽  
Tony Jebara ◽  
Hyungtae Kim ◽  
Mahesh Mohan ◽  
Claire Monteleoni

2021 ◽  
Vol 13 (3) ◽  
pp. 355
Author(s):  
Weixian Tan ◽  
Borong Sun ◽  
Chenyu Xiao ◽  
Pingping Huang ◽  
Wei Xu ◽  
...  

Classification based on polarimetric synthetic aperture radar (PolSAR) images is an emerging technology, and recent years have seen the introduction of various classification methods that have proven effective at identifying typical features of many terrain types. Among the many study regions, the Hunshandake Sandy Land in Inner Mongolia, China stands out for its vast area of sandy land, variety of ground objects, and intricate structure, with more irregular characteristics than conventional land cover. Accounting for the particular surface features of the Hunshandake Sandy Land, an unsupervised classification method based on new decomposition and large-scale spectral clustering with superpixels (ND-LSC) is proposed in this study. Firstly, the polarization scattering parameters are extracted through a new decomposition, rather than other decomposition approaches, which gives rise to a more accurate feature vector estimate. Secondly, large-scale spectral clustering is applied to cope with the massive land area and complex terrain. More specifically, this involves an initial sub-step of superpixel generation via the Adaptive Simple Linear Iterative Clustering (ASLIC) algorithm, in which the feature vector combined with the spatial coordinate information is employed as input, and subsequently a sub-step of representative point selection and bipartite graph formation, followed by the spectral clustering algorithm to complete the classification task. Finally, testing and analysis are conducted on the RADARSAT-2 fully PolSAR dataset acquired over the Hunshandake Sandy Land in 2016. Both qualitative and quantitative experiments comparing against several classification methods show that the proposed method can significantly improve classification performance.


2021 ◽  
Vol 5 (1) ◽  
pp. 14
Author(s):  
Christos Makris ◽  
Georgios Pispirigos

Nowadays, due to the extensive use of information networks in a broad range of fields, e.g., bio-informatics, sociology, digital marketing, computer science, etc., graph theory applications have attracted significant scientific interest. Due to its apparent abstraction, community detection has become one of the most thoroughly studied graph partitioning problems. However, the existing algorithms principally propose iterative solutions of high polynomial order that repetitively require exhaustive analysis. These methods can undoubtedly be considered overdemanding in resources, unscalable, and inapplicable to big data graphs, such as today’s social networks. In this article, a novel, near-linear, and highly scalable community prediction methodology is introduced. Specifically, using a distributed, stacking-based model, which is built on plain network topology characteristics of bootstrap-sampled subgraphs, the underlying community hierarchy of any given social network is efficiently extracted regardless of its size and density. The effectiveness of the proposed methodology has been diligently examined on numerous real-life social networks and proven superior to various similar approaches in terms of performance, stability, and accuracy.

