An Optimization of Closed Frequent Subgraph Mining Algorithm

2017 ◽  
Vol 17 (1) ◽  
pp. 3-15
Author(s):  
J. Demetrovics ◽  
H. M. Quang ◽  
N. V. Anh ◽  
V. D. Thi

Abstract: Graph mining has been a major area of interest within the field of data mining in recent years. A key aspect of graph mining is frequent subgraph mining, and central to frequent subgraph mining is the concept of subgraph isomorphism. One major issue in early subgraph isomorphism research concerns computational complexity: in general, the subgraph isomorphism problem is NP-complete. Previous studies of frequent subgraph mining have not resolved the NP-completeness of subgraph isomorphism. In this paper, we propose a new algorithm that addresses this problem. The proposed algorithm can solve subgraph isomorphism in polynomial time in some settings. Moreover, the new algorithm is proved theoretically to be more effective than previous approaches to closed frequent subgraph mining.
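
To make the complexity issue concrete, the following is a minimal brute-force sketch of a labelled subgraph isomorphism test. It is not the algorithm proposed in the paper; the function name, the edge-list representation, and the choice of non-induced matching on undirected graphs are all assumptions made for illustration. The loop over injective vertex assignments is what makes the naive test exponential in the pattern size.

```python
from itertools import permutations


def is_subgraph_isomorphic(pattern_edges, pattern_labels, target_edges, target_labels):
    """Return True if the pattern maps injectively into the target,
    preserving vertex labels and every pattern edge (non-induced match).

    Hypothetical helper for illustration: graphs are undirected,
    *_labels map vertex -> label, *_edges are lists of vertex pairs.
    """
    p_nodes = list(pattern_labels)
    t_nodes = list(target_labels)
    # Store undirected edges as frozensets so lookup ignores orientation.
    t_edge_set = {frozenset(e) for e in target_edges}

    # Try every injective assignment of pattern vertices to target vertices.
    for image in permutations(t_nodes, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        # Vertex labels must agree under the mapping.
        if any(pattern_labels[u] != target_labels[mapping[u]] for u in p_nodes):
            continue
        # Every pattern edge must be present in the target.
        if all(frozenset((mapping[u], mapping[v])) in t_edge_set
               for u, v in pattern_edges):
            return True
    return False


# Target: a path C - C - O; patterns: a single C - O edge and a single O - O edge.
target_labels = {1: "C", 2: "C", 3: "O"}
target_edges = [(1, 2), (2, 3)]
found_co = is_subgraph_isomorphic([("a", "b")], {"a": "C", "b": "O"},
                                  target_edges, target_labels)
found_oo = is_subgraph_isomorphic([("a", "b")], {"a": "O", "b": "O"},
                                  target_edges, target_labels)
```

With a pattern of k vertices and a target of n vertices, the loop may examine up to n!/(n-k)! assignments, which is why practical miners rely on pruning or on restricted graph classes.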

2018 ◽  
Vol 8 (1) ◽  
pp. 194-209 ◽  
Author(s):  
Büsra Güvenoglu ◽  
Belgin Ergenç Bostanoglu

Abstract: Data mining is a popular research area that focuses on finding unforeseen and important information in large databases. Graphs are one of the popular data structures used to represent large heterogeneous data in this field, which makes graph mining one of the most popular subdivisions of data mining. Subgraphs that occur in a database more frequently than a user-defined threshold are called frequent subgraphs. Frequent subgraphs can give important information about the database, and this information can be used to classify, cluster and index the data. The purpose of this survey is (i) to examine frequent subgraph mining algorithms in terms of the phases of the frequent subgraph discovery process, such as candidate generation and frequency calculation, (ii) to categorize the algorithms according to their general attributes, such as input type, dynamicity of graphs, result type, underlying algorithmic approach, algorithmic design and graph representation, and (iii) to discuss the performance of the algorithms in comparison to each other and the challenges they have recently faced.
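
The frequency calculation phase mentioned above can be sketched as follows. This is a simplified illustration, not a method from the survey: it assumes that vertex labels are unique within each graph, so that a graph can be represented as a set of labelled edges and pattern containment reduces to a subset test. The names `support` and `frequent_patterns` are hypothetical, and the use of `>=` against the threshold is also an assumption (some definitions use a strict comparison).

```python
def support(pattern, database):
    """Count the graphs in the database that contain every edge of the pattern.

    Assumption: each graph is a frozenset of labelled edges and vertex
    labels are unique per graph, so containment is a plain subset test.
    """
    return sum(1 for graph in database if pattern <= graph)


def frequent_patterns(candidates, database, min_support):
    """Keep the candidate patterns whose support reaches the threshold."""
    return [p for p in candidates if support(p, database) >= min_support]


# Three small graphs, each written as a set of labelled edges.
database = [
    frozenset({("A", "B"), ("B", "C")}),
    frozenset({("A", "B")}),
    frozenset({("B", "C"), ("C", "D")}),
]
candidates = [
    frozenset({("A", "B")}),
    frozenset({("B", "C")}),
    frozenset({("C", "D")}),
]
result = frequent_patterns(candidates, database, min_support=2)
```

Without the unique-label assumption, the subset test would have to be replaced by a subgraph isomorphism check, which is where most of the cost of frequency calculation comes from.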


2021 ◽  
pp. 1-10
Author(s):  
Aamir Ali ◽  
Muhammad Asim

Generally, big interaction networks keep the interaction records of actors over a certain period. With the rapid increase in the number of users of these networks, the demand for frequent subgraph mining on large databases is becoming more and more intense. However, most existing studies of frequent subgraphs have not considered the temporal information of the graph. To fill this research gap, this article presents a novel temporal frequent subgraph-based mining algorithm (TFSBMA) using Spark. TFSBMA employs frequent subgraph mining with a minimum threshold in a Spark environment. The proposed algorithm attempts to analyze the temporal frequent subgraphs (TFS) using a Frequent Subgraph Mining Based Using Spark (FSMBUS) method with a minimum support threshold and to evaluate their frequency in a temporal manner. Furthermore, based on the FSMBUS results, the study also tries to compute TFS using an incremental update strategy. Experimental results show that the proposed algorithm can accurately and efficiently compute all the TFS with their corresponding frequencies. In addition, we applied the proposed algorithm to a real-world dataset with artificial time information, which confirms the practical usability of the proposed algorithm.


2014 ◽  
Vol 23 (05) ◽  
pp. 1450005 ◽  
Author(s):  
Brahim Douar ◽  
Chiraz Latiri ◽  
Michel Liquiere ◽  
Yahya Slimani

The aim of the frequent subgraph mining task is to find frequently occurring subgraphs in a large graph database. However, this task remains challenging, as graph and subgraph isomorphisms play a key role throughout the computations. Since subgraph isomorphism testing is a hard problem, subgraph miners are exponential in runtime. To alleviate the complexity issue, we propose to introduce a bias in the projection operator: instead of the costly subgraph isomorphism projection, one can use a polynomial projection that has a semantically valid structural interpretation. This paper presents a new projection operator for graphs named AC-projection, which exhibits nice theoretical complexity properties. We study the size of the search space as well as some practical properties of the projection operator. We also introduce a novel breadth-first algorithm for mining frequent AC-reduced subgraphs. We then show experimentally that an important performance gain (polynomial complexity projection) can be achieved with little or no loss in the quality of the discovered patterns.


To overcome the challenges of managing the rapid growth of social graphs, massive distributed graph mining systems have been developed, such as Pregel, Giraph, Hama, GraphLab, PowerLab, etc. The approach common to all of these systems is to divide the entire graph dataset into smaller partitions and to use a “think like a vertex” programming model to support continual graph computation. In this paper, we use the Optimized Frequent Subgraph Mining algorithm in the Giraph framework model and make a comparative study with different existing distributed systems. To enhance the flexibility and performance of the novel method, we apply different optimization techniques in combination with updates to different run-time limits. We also investigate how performance could be improved by the Giraph distributed system, which plays a vital role in social graphs such as LinkedIn, Twitter, Facebook, etc. The graph input, output, cluster setup and hardware configuration play vital roles in optimizing the performance of our proposed algorithm.
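
The “think like a vertex” model can be illustrated with a small single-machine sketch. This is not Giraph's API and not the algorithm studied in the paper; `run_vertex_centric` is a hypothetical function that imitates the superstep loop of vertex-centric systems by propagating the minimum vertex value through an undirected graph, which labels connected components. It assumes a symmetric adjacency mapping in which every vertex appears as a key.

```python
def run_vertex_centric(adjacency, initial_values, max_supersteps=50):
    """Imitate vertex-centric supersteps: each vertex reads its messages,
    updates its own value, and sends messages to its neighbours.

    Here each vertex keeps the smallest value it has seen, so vertices in
    the same connected component converge to the same value.
    """
    values = dict(initial_values)

    # Superstep 0: every vertex sends its value to its neighbours.
    inbox = {v: [] for v in adjacency}
    for v, neighbours in adjacency.items():
        for n in neighbours:
            inbox[n].append(values[v])

    for _ in range(max_supersteps):
        outbox = {v: [] for v in adjacency}
        any_active = False
        for v, messages in inbox.items():
            if not messages:
                continue  # no messages: the vertex stays halted
            best = min(messages)
            if best < values[v]:
                # The vertex changed, so it notifies its neighbours.
                values[v] = best
                any_active = True
                for n in adjacency[v]:
                    outbox[n].append(best)
        if not any_active:
            break  # every vertex has voted to halt
        inbox = outbox
    return values


# Two components: {1, 2, 3} and {4, 5}; each vertex starts with its own id.
adjacency = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
components = run_vertex_centric(adjacency, {v: v for v in adjacency})
```

In a distributed system the vertices would be partitioned across workers and the message exchange between supersteps would go over the network, but the per-vertex logic keeps this shape.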


2014 ◽  
Vol 2014 ◽  
pp. 1-6 ◽  
Author(s):  
Saif Ur Rehman ◽  
Sohail Asghar ◽  
Yan Zhuang ◽  
Simon Fong

Due to the rapid development of Internet technology and new scientific advances, the number of applications that model data as graphs is increasing, because graphs have high expressive power for modelling complicated structures. Graph mining is a well-explored area of research which is gaining popularity in the data mining community. A graph is a general model for representing data and has been used in many domains such as cheminformatics, web information management systems, computer networks, and bioinformatics, to name a few. In graph mining, frequent subgraph discovery is a challenging task. Frequent subgraph mining is concerned with the discovery of those subgraphs that have frequent or multiple instances within a given graph dataset. A large number of frequent subgraph mining algorithms have been proposed in the literature; these include FSG, AGM, gSpan, CloseGraph, SPIN, Gaston, and MoFa. The objective of this research work is to perform a quantitative comparison of the techniques listed above. The performance of these techniques has been evaluated through a number of experiments based on three different state-of-the-art graph datasets. This novel work will provide a basis for anyone who is working to design a new frequent subgraph discovery technique.


2012 ◽  
Vol 28 (1) ◽  
pp. 75-105 ◽  
Author(s):  
Chuntao Jiang ◽  
Frans Coenen ◽  
Michele Zito

Abstract: Graph mining is an important research area within the domain of data mining. The field of study concentrates on the identification of frequent subgraphs within graph data sets. The research goals are directed at: (i) effective mechanisms for generating candidate subgraphs (without generating duplicates) and (ii) how best to process the generated candidate subgraphs so as to identify the desired frequent subgraphs in a way that is computationally efficient and procedurally effective. This paper presents a survey of current research in the field of frequent subgraph mining and proposes solutions to address the main research issues.

