Spectral clustering and the high-dimensional stochastic blockmodel

2011 ◽  
Vol 39 (4) ◽  
pp. 1878-1915 ◽  
Author(s):  
Karl Rohe ◽  
Sourav Chatterjee ◽  
Bin Yu
Author(s):  
Clinton Morris ◽  
Carolyn C. Seepersad

Design space exploration can reveal the underlying structure of design problems of interest. In a set-based approach, for example, exploration can identify sets of designs or regions of the design space that meet specific performance requirements. For some problems, promising designs may cluster in multiple regions of the design space, and the boundaries of those clusters may be irregularly shaped and difficult to predict. Visualizing the promising regions can clarify the design space structure, but design spaces are typically high-dimensional, making it difficult to visualize the space in three dimensions. Techniques have been introduced to map high-dimensional design spaces to low-dimensional, visualizable spaces. Before the promising regions can be visualized, however, the first task is to identify how many clusters of promising designs exist in the high-dimensional design space. Unsupervised machine learning methods, such as spectral clustering, have been used for this task. Spectral clustering is generally accurate but becomes computationally intractable with large sets of candidate designs. This paper therefore introduces a technique for accurately identifying clusters of promising designs that remains viable with large sets of designs. The technique is based on spectral clustering but reduces its computational cost by leveraging the Nyström method in the formulation of self-tuning spectral clustering. After validation on a simplified example, the method is applied to identify clusters of high-performance designs for a high-dimensional negative stiffness metamaterials design problem.
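As a rough illustration of the self-tuning idea named in the abstract above, the sketch below builds a locally scaled affinity matrix, where each point's scale is the distance to its k-th nearest neighbor, and then counts eigenvalues of the normalized affinity that are close to 1. Counting near-unit eigenvalues is a simplified stand-in for estimating the number of clusters and only works when clusters are well separated; it is not the paper's criterion. The Nyström approximation is not shown. The function names, the toy `designs` array, and the tolerance `tol` are assumptions made for this example.

```python
import numpy as np

def local_scaling_affinity(points, k=7):
    """Affinity A_ij = exp(-d_ij^2 / (sigma_i * sigma_j)) with a zero diagonal.

    sigma_i is the distance from point i to its k-th nearest neighbor.
    """
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Column 0 of each sorted row is the point itself (distance 0),
    # so column k holds the k-th nearest neighbor.
    sigma = np.sqrt(np.sort(sq_dist, axis=1)[:, k])
    affinity = np.exp(-sq_dist / np.outer(sigma, sigma))
    np.fill_diagonal(affinity, 0.0)
    return affinity

def count_clusters(affinity, tol=1e-2):
    """Count eigenvalues of D^{-1/2} A D^{-1/2} within tol of 1 (hypothetical rule)."""
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = affinity * np.outer(inv_sqrt, inv_sqrt)
    eigenvalues = np.linalg.eigvalsh(normalized)
    return int(np.sum(eigenvalues > 1.0 - tol))

# Toy data (assumption): two 3x3 grids of "designs", far apart in a 2-D space.
grid = np.array([[x, y] for x in range(3) for y in range(3)], dtype=float)
designs = np.vstack([grid, grid + 50.0])

affinity = local_scaling_affinity(designs, k=3)
n_clusters = count_clusters(affinity)
print(n_clusters)
```

Building the full pairwise affinity costs quadratic memory in the number of designs, which is the bottleneck the abstract says the Nyström method is meant to relieve.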


Author(s):  
Pushpalatha R. ◽  
K. Meenakshi Sundaram

Data mining is an essential process for identifying patterns in large datasets through machine learning techniques and database systems. Clustering high-dimensional data is becoming a very challenging process due to the curse of dimensionality. In addition, existing approaches have not improved space complexity or data retrieval performance. To overcome these limitations, a Spectral Clustering Based VP Tree Indexing Technique is introduced. The technique clusters and indexes densely populated high-dimensional data points for effective data retrieval based on a user query. A Normalized Spectral Clustering Algorithm is used to group similar high-dimensional data points. A Vantage Point Tree is then constructed to index the clustered data points with minimum space complexity. Finally, the indexed data are retrieved based on the user query using a Vantage Point Tree based Data Retrieval Algorithm. This in turn helps to improve the true positive rate with minimum retrieval time. Performance is measured in terms of space complexity, true positive rate, and data retrieval time on the El Nino weather data sets from the UCI Machine Learning Repository. Experimental results show that the proposed technique reduces space complexity by 33% and data retrieval time by 24% compared to state-of-the-art works.
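To make the indexing step more concrete, here is a minimal sketch of a generic vantage point tree with nearest-neighbor search, written in plain Python. It is not the authors' algorithm: the median split, the random choice of vantage point, and all names (`VPNode`, `build_vp_tree`, `nearest`) are assumptions for illustration, and the spectral clustering stage is omitted.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points closer than radius
        self.outside = outside  # subtree of points at distance >= radius

def build_vp_tree(points, rng):
    """Recursively split points by their distance to a random vantage point."""
    if not points:
        return None
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    distances = [euclidean(vantage, p) for p in points]
    radius = sorted(distances)[len(distances) // 2]
    inside = [p for p, d in zip(points, distances) if d < radius]
    outside = [p for p, d in zip(points, distances) if d >= radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, rng), build_vp_tree(outside, rng))

def nearest(node, query, best=None):
    """Return (distance, point) for the stored point closest to query."""
    if node is None:
        return best
    d = euclidean(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d < node.radius:
        best = nearest(node.inside, query, best)
        # The outside region can hold a closer point only if the query ball
        # reaches the boundary at distance radius.
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, best)
    else:
        best = nearest(node.outside, query, best)
        if d - best[0] <= node.radius:
            best = nearest(node.inside, query, best)
    return best

rng = random.Random(0)
points = [tuple(rng.random() for _ in range(3)) for _ in range(200)]
tree = build_vp_tree(points, rng)
print(nearest(tree, (0.5, 0.5, 0.5)))
```

The pruning tests rely only on the triangle inequality, so the same structure works with any metric, which is one reason vantage point trees are used for high-dimensional data.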


2018 ◽  
Vol 27 (05) ◽  
pp. 1850020 ◽  
Author(s):  
Cong-Zhe You ◽  
Vasile Palade ◽  
Xiao-Jun Wu

Subspace clustering algorithms are often employed when dealing with high-dimensional data. As a representative approach, Low-Rank Representation (LRR) of data has achieved great success in subspace segmentation tasks in applications such as image processing. Traditional LRR-related methods consist of two separate tasks: first, constructing the affinity graph using low-rank minimization techniques, and then performing spectral clustering on the affinity graph to obtain the final segmentation. Since these two steps are independent of each other, this approach does not guarantee that the results obtained are globally optimal. In this paper, a method called Robust Structured Low-Rank Representation (RSLRR) is proposed, which integrates the two above-mentioned tasks and solves a joint optimization problem. This paper also puts forward a method for solving the joint optimization problem that efficiently obtains both the segmentation and the structured low-rank representation. Experiments on several standard datasets show that, compared with other algorithms, the proposed algorithm achieves better clustering results.
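The second stage of the traditional two-step pipeline described above can be sketched as follows: a coefficient matrix `Z` is symmetrized into an affinity graph, and the graph is split with spectral clustering. This is only an illustration under stated assumptions. The matrix `Z` is hand-made rather than obtained from low-rank minimization, the two-cluster split uses the sign of the second eigenvector instead of k-means, and the function names are hypothetical. It does not implement RSLRR.

```python
import numpy as np

def affinity_from_representation(Z):
    """Symmetric, non-negative affinity W = (|Z| + |Z|^T) / 2."""
    A = np.abs(Z)
    return 0.5 * (A + A.T)

def spectral_bipartition(W):
    """Split a graph in two using the normalized Laplacian's second eigenvector."""
    degree = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(W)) - W * np.outer(inv_sqrt, inv_sqrt)
    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, eigenvectors = np.linalg.eigh(laplacian)
    return (eigenvectors[:, 1] > 0).astype(int)

# Hand-made coefficient matrix (assumption): two groups of three samples,
# strong coefficients within a group and weak ones across groups.
Z = np.array([
    [0.0, 1.2, 1.2, 0.02, 0.02, 0.02],
    [-0.8, 0.0, 1.2, 0.02, 0.02, 0.02],
    [-0.8, -0.8, 0.0, 0.02, 0.02, 0.02],
    [0.0, 0.0, 0.0, 0.0, 1.2, 1.2],
    [0.0, 0.0, 0.0, -0.8, 0.0, 1.2],
    [0.0, 0.0, 0.0, -0.8, -0.8, 0.0],
])

W = affinity_from_representation(Z)
labels = spectral_bipartition(W)
print(labels)
```

Because the clustering step only sees `W`, errors made when computing `Z` cannot be corrected later, which is the independence between the two steps that the abstract identifies as a weakness. Which group receives label 0 depends on the arbitrary sign of the eigenvector.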

