frequent subgraphs Latest Research Papers

A New Approach for Mining Correlated Frequent Subgraphs

ACM Transactions on Management Information Systems ◽

10.1145/3473042 ◽

2022 ◽

Vol 13 (1) ◽

pp. 1-28

Author(s):

Mohammad Ehsan Shahmi Chowdhury ◽

Chowdhury Farhan Ahmed ◽

Carson K. Leung

Keyword(s):

Correlation Analysis ◽

Real World ◽

Graph Mining ◽

Real Life ◽

Small Subset ◽

Method Evaluation ◽

New Approach ◽

Frequent Subgraph ◽

Frequent Subgraphs ◽

Complete Framework

Graph datasets now have a vast range of applications. As a result, graph mining (mining graph datasets to extract frequent subgraphs) has proven crucial in numerous domains. It is important to perform correlation analysis among the subparts (i.e., elements) of the frequent subgraphs generated by graph mining in order to uncover interesting information. However, most existing work focuses on the complexities of dealing with graph structures, and little of it addresses correlation analysis. For instance, one previous work in this area used a very naive approach and dealt with only a small subset of the problem. Hence, in this article, a new measure is proposed to aid in the analysis of large subgraphs mined from various types of graph transactions in the dataset. These subgraphs are immense in terms of their structural composition and thus parallel the entire set of graphs in the real world. A complete framework for discovering the relations among parts of a frequent subgraph is proposed using our new method. Evaluation results show the usefulness and accuracy of the newly defined measure on real-life graph datasets.


RASMA: a reverse search algorithm for mining maximal frequent subgraphs

BioData Mining ◽

10.1186/s13040-021-00250-1 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Saeed Salem ◽

Mohammed Alokshiya ◽

Mohammad Al Hasan

Keyword(s):

Gene Expression ◽

Search Algorithm ◽

Enrichment Analysis ◽

Research Problem ◽

Disease Classification ◽

Connected Subgraph ◽

Gene Coexpression ◽

Key Innovation ◽

Frequent Subgraphs ◽

Coexpression Networks

Abstract Background: Given a collection of coexpression networks over a set of genes, identifying subnetworks that appear frequently is an important research problem known as mining frequent subgraphs. Maximal frequent subgraphs are a representative set of frequent subgraphs; a frequent subgraph is maximal if it does not have a super-graph that is frequent. In the bioinformatics discipline, methodologies for mining frequent and/or maximal frequent subgraphs can be used to discover interesting network motifs that elucidate complex interactions among genes, reflected through the edges of the frequent subnetworks. Further study of frequent coexpression subnetworks enhances the discovery of biological modules and biological signatures for gene expression and disease classification. Results: We propose a reverse search algorithm, called RASMA, for mining frequent and maximal frequent subgraphs in a given collection of graphs. A key innovation in RASMA is a connected subgraph enumerator that uses a reverse-search strategy to enumerate connected subgraphs of an undirected graph. Using this enumeration strategy, RASMA obtains all maximal frequent subgraphs very efficiently. To overcome the computationally prohibitive task of enumerating all frequent subgraphs while mining for the maximal frequent subgraphs, RASMA employs several pruning strategies that substantially improve its overall runtime performance. Experimental results show that on large gene coexpression networks, the proposed algorithm efficiently mines biologically relevant maximal frequent subgraphs. Conclusion: Extracting recurrent gene coexpression subnetworks from multiple gene expression experiments enables the discovery of functional modules and subnetwork biomarkers. We have proposed a reverse search algorithm for mining maximal frequent subnetworks. Enrichment analysis of the extracted maximal frequent subnetworks reveals that subnetworks that are frequent are highly enriched with known biological ontologies.
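
The definitions in this abstract (frequent, connected, maximal) can be illustrated with a small brute-force sketch. This is not RASMA: it does no reverse search and no pruning, and it is exponential in the number of edges. It assumes each network is a set of edges over the same named genes, so a subgraph is simply a set of edges and its support is the number of networks containing all of them. The function names, the toy networks, and the `min_support` parameter are hypothetical.

```python
from itertools import combinations

def connected(edges):
    """Return True if the edge set forms a single connected component."""
    edges = list(edges)
    if not edges:
        return False
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = edges[0][0]
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(adj)

def frequent_connected_subgraphs(networks, min_support):
    """Brute force: connected edge sets contained in >= min_support networks."""
    all_edges = sorted(set().union(*networks))
    frequent = []
    for size in range(1, len(all_edges) + 1):
        for cand in combinations(all_edges, size):
            cand = frozenset(cand)
            support = sum(1 for net in networks if cand <= net)
            if support >= min_support and connected(cand):
                frequent.append(cand)
    return frequent

def maximal(frequent):
    """Keep only subgraphs with no frequent proper super-graph."""
    return [g for g in frequent if not any(g < h for h in frequent)]

# Hypothetical toy coexpression networks over genes a, b, c, d.
networks = [
    {("a", "b"), ("b", "c"), ("c", "d")},
    {("a", "b"), ("b", "c")},
    {("a", "b"), ("b", "c"), ("c", "d")},
]
result = maximal(frequent_connected_subgraphs(networks, min_support=2))
```

With `min_support=2` the six frequent connected subgraphs collapse to one maximal subgraph (the full path a-b-c-d), which shows why the maximal set is described as a compact representative of the frequent set.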


K-Graph: Knowledgeable Graph for Text Documents

Journal of Konbin ◽

10.2478/jok-2021-0006 ◽

2021 ◽

Vol 51 (1) ◽

pp. 73-89

Author(s):

Varsha Mittal ◽

Durgaprasad Gangodkar ◽

Bhaskar Pant

Keyword(s):

Structural Information ◽

Low Complexity ◽

Semantic Knowledge ◽

Access Time ◽

Graph Databases ◽

Text Documents ◽

Memory Overhead ◽

Frequent Subgraphs ◽

Real World Datasets ◽

Different Levels

Abstract Graph databases are applied in many areas, including science and business, due to their low complexity, low overhead, and lower time complexity. Graph-based storage offers the advantage of capturing semantic and structural information rather than simply using the Bag-of-Words technique. An approach called Knowledgeable graphs (K-Graph) is proposed to capture semantic knowledge. Documents are stored using graph nodes. Using weighted subgraphs, the frequent subgraphs are extracted and stored in the Fast Embedding Referral Table (FERT). The table is maintained at different levels according to the headings and subheadings of the documents. This reduces the memory overhead and the retrieval and access time for the required subgraph. The authors propose an approach that will reduce data redundancy to a large extent. With real-world datasets, K-Graph’s performance and power usage are threefold greater than those of current methods. Ninety-nine per cent accuracy demonstrates the robustness of the proposed algorithm.


RASMA: A Reverse Search Algorithm for Mining Maximal Frequent Subgraphs

10.21203/rs.3.rs-46148/v3 ◽

2021 ◽

Author(s):

Saeed Salem ◽

Mohammed Alokshiya ◽

Mohammad Al Hasan

Keyword(s):

Gene Expression ◽

Search Algorithm ◽

Enrichment Analysis ◽

Research Problem ◽

Disease Classification ◽

Connected Subgraph ◽

Gene Coexpression ◽

Key Innovation ◽

Frequent Subgraphs ◽

Coexpression Networks

Abstract Background: Given a collection of coexpression networks over a set of genes, identifying subnetworks that appear frequently is an important research problem known as mining frequent subgraphs. Maximal frequent subgraphs are a representative set of frequent subgraphs; a frequent subgraph is maximal if it does not have a super-graph that is frequent. In the bioinformatics discipline, methodologies for mining frequent and/or maximal frequent subgraphs can be used to discover interesting network motifs that elucidate complex interactions among genes, reflected through the edges of the frequent subnetworks. Further study of frequent coexpression subnetworks enhances the discovery of biological modules and biological signatures for gene expression and disease classification. Results: We propose a reverse search algorithm, called RASMA, for mining frequent and maximal frequent subgraphs in a given collection of graphs. A key innovation in RASMA is a connected subgraph enumerator that uses a reverse-search strategy to enumerate connected subgraphs of an undirected graph. Using this enumeration strategy, RASMA obtains all maximal frequent subgraphs very efficiently. To overcome the computationally prohibitive task of enumerating all frequent subgraphs while mining for the maximal frequent subgraphs, RASMA employs several pruning strategies that substantially improve its overall runtime performance. Experimental results show that on large gene coexpression networks, the proposed algorithm efficiently mines biologically relevant maximal frequent subgraphs. Conclusion: Extracting recurrent gene coexpression subnetworks from multiple gene expression experiments enables the discovery of functional modules and subnetwork biomarkers. We have proposed a reverse search algorithm for mining maximal frequent subnetworks. Enrichment analysis of the extracted maximal frequent subnetworks reveals that subnetworks that are frequent are highly enriched with known biological ontologies.


Learning Knowledge Using Frequent Subgraph Mining from Ontology Graph Data

Applied Sciences ◽

10.3390/app11030932 ◽

2021 ◽

Vol 11 (3) ◽

pp. 932

Author(s):

Kwangyon Lee ◽

Haemin Jung ◽

June Seok Hong ◽

Wooju Kim

Keyword(s):

Large Scale ◽

Semantic Information ◽

Frequent Subgraph Mining ◽

Small Unit ◽

Graph Data ◽

Novel Method ◽

Frequent Subgraphs ◽

Rating Prediction ◽

Knowledge Graphs ◽

New Item

In many areas, vast amounts of information are rapidly accumulating in the form of ontology-based knowledge graphs, and the use of information in these forms of knowledge graphs is becoming increasingly important. This study proposes a novel method for efficiently learning frequent subgraphs (i.e., knowledge) from ontology-based graph data. An ontology-based large-scale graph is decomposed into small unit subgraphs, which are used as the unit to calculate the frequency of the subgraph. The frequent subgraphs are extracted through candidate generation and chunking processes. To verify the usefulness of the extracted frequent subgraphs, the methodology was applied to movie rating prediction. Using the frequent subgraphs as user profiles, the graph similarity between the rating graph and new item graph was calculated to predict the rating. The MovieLens dataset was used for the experiment, and a comparison showed that the proposed method outperformed other widely used recommendation methods. This study is meaningful in that it proposed an efficient method for extracting frequent subgraphs while maintaining semantic information and considering scalability in large-scale graphs. Furthermore, the proposed method can provide results that include semantic information to serve as a logical basis for rating prediction or recommendation, which existing methods are unable to provide.
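
As a rough illustration of counting frequencies over small unit subgraphs, the sketch below treats each (subject, predicate, object) triple as a unit and abstracts entities to their ontology types before counting. This is only an assumption about what a "unit subgraph" could look like; the abstract does not specify the decomposition, and the candidate generation and chunking steps are not reproduced here. All names, types, and data are hypothetical.

```python
from collections import Counter

def unit_patterns(triples, entity_type):
    """Map each triple to a type-level pattern, e.g. (User, rated, Movie)."""
    for s, p, o in triples:
        # Entities without a known type are kept as-is.
        yield (entity_type.get(s, s), p, entity_type.get(o, o))

def frequent_units(triples, entity_type, min_count):
    """Return the type-level unit patterns that occur at least min_count times."""
    counts = Counter(unit_patterns(triples, entity_type))
    return {pattern: n for pattern, n in counts.items() if n >= min_count}

# Hypothetical toy knowledge graph in the movie domain.
triples = [
    ("alice", "rated", "Heat"),
    ("bob", "rated", "Alien"),
    ("Heat", "hasGenre", "Crime"),
    ("Alien", "hasGenre", "SciFi"),
    ("Heat", "directedBy", "Mann"),
]
entity_type = {
    "alice": "User", "bob": "User",
    "Heat": "Movie", "Alien": "Movie",
    "Crime": "Genre", "SciFi": "Genre",
    "Mann": "Person",
}
units = frequent_units(triples, entity_type, min_count=2)
```

Counting at the type level keeps the semantic labels attached to each pattern, which is the property the abstract highlights when it says the extracted subgraphs retain semantic information.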


RASMA: A Reverse Search Algorithm for Mining Maximal Frequent Subgraphs

10.21203/rs.3.rs-46148/v2 ◽

2020 ◽

Author(s):

Saeed Salem ◽

Mohammed Alokshiya ◽

Mohammad Al Hasan

Keyword(s):

Gene Expression ◽

Search Algorithm ◽

Enrichment Analysis ◽

Research Problem ◽

Disease Classification ◽

Connected Subgraph ◽

Gene Coexpression ◽

Key Innovation ◽

Frequent Subgraphs ◽

Coexpression Networks

Abstract Background: Given a collection of coexpression networks over a set of genes, identifying subnetworks that appear frequently is an important research problem known as mining frequent subgraphs. Maximal frequent subgraphs are a representative set of frequent subgraphs; a frequent subgraph is maximal if it does not have a super-graph that is frequent. In the bioinformatics discipline, methodologies for mining frequent and/or maximal frequent subgraphs can be used to discover interesting network motifs that elucidate complex interactions among genes, reflected through the edges of the frequent subnetworks. Further study of frequent coexpression subnetworks enhances the discovery of biological modules and biological signatures for gene expression and disease classification. Results: We propose a reverse search algorithm, called RASMA, for mining frequent and maximal frequent subgraphs in a given collection of graphs. A key innovation in RASMA is a connected subgraph enumerator that uses a reverse-search strategy to enumerate connected subgraphs of an undirected graph. Using this enumeration strategy, RASMA obtains all maximal frequent subgraphs very efficiently. To overcome the computationally prohibitive task of enumerating all frequent subgraphs while mining for the maximal frequent subgraphs, RASMA employs several pruning strategies that substantially improve its overall runtime performance. Experimental results show that on large gene coexpression networks, the proposed algorithm efficiently mines biologically relevant maximal frequent subgraphs. Conclusion: Extracting recurrent gene coexpression subnetworks from multiple gene expression experiments enables the discovery of functional modules and subnetwork biomarkers. We have proposed a reverse search algorithm for mining maximal frequent subnetworks. Enrichment analysis of the extracted maximal frequent subnetworks reveals that subnetworks that are frequent are highly enriched with known biological ontologies.


RASMA: A Reverse Search Algorithm for Mining Frequent Subgraphs

10.21203/rs.3.rs-46148/v1 ◽

2020 ◽

Author(s):

Saeed Salem ◽

Mohammed Alokshiya ◽

Mohammad Al Hasan

Keyword(s):

Gene Expression ◽

Search Algorithm ◽

Enrichment Analysis ◽

Disease Classification ◽

Induced Subgraphs ◽

Biologically Relevant ◽

Large Gene ◽

Biological Ontologies ◽

Gene Coexpression ◽

Frequent Subgraphs

Abstract Background: Mining frequent co-expression networks enables the discovery of interesting network motifs that elucidate important interactions among genes. Such interaction subnetworks have been shown to enhance the discovery of biological modules and subnetwork signatures for gene expression and disease classification. Results: We propose a reverse search algorithm for mining frequent and maximal subgraphs over a collection of graphs. We develop an approach for enumerating connected edge-induced subgraphs of an undirected graph by using a reverse-search algorithm, and then use this enumeration strategy for mining all maximal frequent subgraphs. To overcome the computationally prohibitive task of enumerating all frequent subgraphs while mining for maximal subgraphs, the proposed algorithm employs several pruning strategies, which substantially improve its overall runtime performance. Experimental results show that on large gene coexpression networks, the proposed algorithm efficiently mines biologically relevant maximal frequent subgraphs. Conclusion: Extracting recurrent gene coexpression subnetworks from multiple gene expression experiments enables the discovery of functional modules and subnetwork biomarkers. We have proposed a reverse search algorithm for mining maximal frequent subnetworks. Enrichment analysis of the extracted maximal frequent subnetworks reveals that subnetworks that are more frequent are more likely to be enriched with biological ontologies.


Social Network Analysis of Passes and Communication Graph in Football by mining Frequent Subgraphs

2020 6th International Conference on Web Research (ICWR) ◽

10.1109/icwr49608.2020.9122303 ◽

2020 ◽

Author(s):

Amir Hossein Ahmadi ◽

Ahmadi Noori ◽

Babak Teimourpour

Keyword(s):

Social Network ◽

Social Network Analysis ◽

Network Analysis ◽

Communication Graph ◽

Frequent Subgraphs


Mining approximate frequent dense modules from multiple gene expression datasets

10.29007/d87q ◽

2020 ◽

Author(s):

San Ha Seo ◽

Saeed Salem

Keyword(s):

Gene Expression ◽

Gene Annotation ◽

Expression Data ◽

Frequent Subgraph Mining ◽

Multiple Gene ◽

Real Gene ◽

Frequent Subgraph ◽

Frequent Subgraphs ◽

Coexpression Networks ◽

Functional Gene Annotation

A large amount of gene expression data has been collected for various environmental and biological conditions. Extracting co-expression networks that are recurrent in multiple co-expression networks has shown promise in functional gene annotation and biomarker discovery. Frequent subgraph mining reports a large number of subnetworks. In this work, we propose to mine approximate dense frequent subgraphs. Our proposed approach reports representative frequent subgraphs that are also dense. Our experiments on real gene coexpression networks show that frequent subgraphs are biologically interesting, as evidenced by the large percentage of biologically enriched frequent dense subgraphs.


A Novel Distributed Approach for Frequent Subgraphs Mining Across Cloud Computing System (DistFsm)

Applied Mathematics & Information Sciences ◽

10.18576/amis/140214 ◽

2020 ◽

Vol 14 (2) ◽

pp. 297-307

Keyword(s):

Cloud Computing ◽

Computing System ◽

Distributed Approach ◽

Cloud Computing System ◽

Frequent Subgraphs

