A Comparative Study of Graph Kernels and Clustering Algorithms

Author(s):  
Riju Bhattacharya ◽  
Naresh Kumar Nagwani ◽  
Sarsij Tripathi

Graph kernels have emerged over the last decade as a promising and popular method for graph clustering. This work presents a comparative study of five standard graph kernel techniques for graph clustering: the vertex histogram kernel, shortest path kernel, graphlet kernel, k-step random walk kernel, and Weisfeiler-Lehman kernel. The clustering methods considered for the kernel comparison are hierarchical, k-means, model-based, fuzzy-based, and self-organizing map clustering. The kernels are compared across these clustering techniques on the MUTAG benchmark dataset. Clustering performance is assessed with internal validation measures: connectivity, the Dunn index, and the silhouette index. The comparative analysis is intended to help researchers select an appropriate kernel method for effective graph clustering. The results indicate that the k-step random walk and shortest path kernels performed best among all the graph clustering approaches compared.
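As a rough illustration of the simplest kernel named above, the sketch below computes a vertex histogram kernel as the dot product of two graphs' vertex-label counts, and derives a kernel-induced distance that a clustering method could consume. The function names and the toy label lists are assumptions made for illustration; they are not taken from the study, and the study's own implementation and MUTAG preprocessing may differ.

```python
from collections import Counter


def vertex_histogram_kernel(labels_a, labels_b):
    """Dot product of the two graphs' vertex-label count vectors."""
    hist_a = Counter(labels_a)
    hist_b = Counter(labels_b)
    # Counter returns 0 for labels that are absent from hist_b.
    return sum(count * hist_b[label] for label, count in hist_a.items())


def kernel_distance(labels_a, labels_b):
    """Distance induced by the kernel: sqrt(k(a,a) + k(b,b) - 2*k(a,b))."""
    k_aa = vertex_histogram_kernel(labels_a, labels_a)
    k_bb = vertex_histogram_kernel(labels_b, labels_b)
    k_ab = vertex_histogram_kernel(labels_a, labels_b)
    return max(k_aa + k_bb - 2 * k_ab, 0) ** 0.5


# Hypothetical toy graphs, each reduced to its list of vertex labels.
g1 = ["C", "C", "O"]
g2 = ["C", "N", "O", "O"]
g3 = ["C", "C", "O"]

print(vertex_histogram_kernel(g1, g2))  # similarity between g1 and g2
print(kernel_distance(g1, g3))          # identical histograms give distance 0
```

This kernel ignores edges entirely, which is one reason structure-aware kernels such as the shortest path or random walk kernels can behave differently in a comparison of this kind.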

Author(s):  
JIANLIN ZHU ◽  
JIN HUANG ◽  
DAICUI ZHOU ◽  
ZHONGBAO YIN ◽  
GUOPING ZHANG ◽  
...  

Software architecture recovery aims to gain an architectural-level understanding of a software system when no architecture description exists. In recent years, researchers have adopted various software clustering techniques to detect the hierarchical structure of software systems. Most graph clustering techniques focus on the connectivity between program elements but ignore similarity, which is also a key measure for finding the elements of one module. In this paper we propose a novel hierarchical graph clustering algorithm, DGHC, which considers both similarity and connectivity between program elements. During the transformation of the program dependence graph, edges representing similarity between elements are added. Similar elements are then grouped by density-based approaches. The alternative strategy is adopted to find groups of closely connected and similar elements. Meanwhile, we adjust the contribution of connectivity and similarity with a flexible clustering algorithm based on a short random walk model, which captures more structural information about the software in order to find its multiple layers. Furthermore, a new method called Multi-layer Propagation Gap is proposed to suggest stable layers of the hierarchical clustering result as the multiple layers of the software system. Extensive experimental results, obtained through comparison with various software clustering methods, illustrate the effectiveness and efficiency of DGHC in detecting the hierarchical structure of software.


Author(s):  
Gyanendra Mohan Patel ◽  
Anupam Singh ◽  
Tanishka Bhala ◽  
Aryaman Jora ◽  
Divyansh Chandna

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Salvatore Citraro ◽  
Giulio Rossetti

Grouping well-connected nodes that also result in label-homogeneous clusters is a task often known as attribute-aware community discovery. For node-enriched graph clustering methods, rigorous tools are needed to evaluate the quality of the resulting partitions. In this work, we present X-Mark, a model that generates synthetic node-attributed graphs with planted communities. Its novelty consists in forming communities and node labels contextually while handling categorical or continuous attribute information. Moreover, we propose a comparison between attribute-aware algorithms, testing them against our benchmark. In line with the different classification schemas from recent state-of-the-art surveys, our results suggest that X-Mark can shed light on the differences between several families of algorithms.


Author(s):  
Maulida Ayu Fitriani ◽  
Aina Musdholifah ◽  
Sri Hartati

Clustering methods for obtaining optimal information continue to evolve; one such development is the Evolutionary Algorithm (EA). Adaptive Unified Differential Evolution (AuDE) is a development of Differential Evolution (DE), which is one of the EA techniques. AuDE has self-adaptive control parameters for the scale factor (F) and the crossover rate (Cr). It also has a single mutation strategy that represents the most commonly used standard mutation strategies from previous studies. The AuDE clustering method was tested on 4 datasets. The Silhouette Index and the CS Measure are the fitness functions used to measure the quality of the clustering results. The quality of the AuDE clustering results is then compared against that of the DE method. The results show that the AuDE mutation strategy can expand the search for cluster centers performed by DE, so that better clustering quality can be obtained. The ratio of AuDE to DE quality is 1:0.816 using the Silhouette Index, whereas the CS Measure gives a ratio of 0.565:1. The execution time required by AuDE is better but not significantly so: the ratio is 0.99:1 with the Silhouette Index and 0.184:1 with the CS Measure.
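To make the roles of F and Cr concrete, here is a minimal sketch of one step of classic DE, the DE/rand/1 mutation followed by binomial crossover. This is the textbook baseline and not AuDE itself: F and Cr are fixed here rather than self-adapted, and the function name, the toy population (read as flattened cluster-center coordinates), and the seed are assumptions for illustration.

```python
import random


def de_trial_vector(population, target_idx, f, cr, rng):
    """Build one DE/rand/1/bin trial vector for population[target_idx]."""
    target = population[target_idx]
    # Pick three distinct individuals, all different from the target.
    candidates = [i for i in range(len(population)) if i != target_idx]
    r1, r2, r3 = rng.sample(candidates, 3)
    a, b, c = population[r1], population[r2], population[r3]
    dim = len(target)
    forced = rng.randrange(dim)  # one coordinate always comes from the mutant
    trial = []
    for j in range(dim):
        if j == forced or rng.random() < cr:
            # Mutation: base vector plus scaled difference of two others.
            trial.append(a[j] + f * (b[j] - c[j]))
        else:
            # Crossover keeps the target's coordinate.
            trial.append(target[j])
    return trial


# Hypothetical population: each vector stands for candidate cluster centers.
rng = random.Random(0)
population = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [3.0, 2.0], [4.0, 1.5]]
trial = de_trial_vector(population, target_idx=0, f=0.5, cr=0.9, rng=rng)
print(trial)
```

In a full DE loop the trial vector would replace the target only if its fitness (for example, the Silhouette Index of the induced clustering) is at least as good.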


Author(s):  
Yan Zheng ◽  
Xiaochun Cheng ◽  
Ronghuai Huang ◽  
Yi Man

Author(s):  
B.K. Tripathy ◽  
Adhir Ghosh

The development of data clustering algorithms has been pursued by researchers since the introduction of the k-means algorithm (MacQueen 1967; Lloyd 1982). These algorithms were subsequently modified to handle categorical data. To handle situations where objects can have memberships in multiple clusters, fuzzy clustering and rough clustering methods were introduced (Lingras et al 2003, 2004a). There are many extensions of these initial algorithms (Lingras et al 2004b; Lingras 2007; Mitra 2004; Peters 2006, 2007). The MMR algorithm (Parmar et al 2007), its extensions (Tripathy et al 2009, 2011a, 2011b) and the MADE algorithm (Herawan et al 2010) use rough set techniques for clustering. In this chapter, the authors focus on rough set based clustering algorithms and provide a comparative study of all the fuzzy set based and rough set based clustering algorithms in terms of their efficiency. They also present problems for future studies in the direction of the topics covered.
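The idea of an object having memberships in multiple clusters can be illustrated with the standard fuzzy c-means membership formula, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)), where d_i is the object's distance to cluster center i and m > 1 is the fuzzifier. This is a sketch under that assumption only; it is not one of the rough set algorithms the chapter compares, and the function name and sample distances are hypothetical.

```python
def fuzzy_memberships(distances, m=2.0):
    """Fuzzy c-means style memberships of one object in every cluster."""
    zeros = [i for i, d in enumerate(distances) if d == 0]
    if zeros:
        # The object sits exactly on one or more centers: split membership there.
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
            for d_i in distances]


# Hypothetical distances from one object to two cluster centers.
memberships = fuzzy_memberships([1.0, 3.0])
print(memberships)  # the nearer center receives the larger membership
```

The memberships sum to 1 across clusters, in contrast to k-means, which assigns each object to exactly one cluster.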

