Fast Subgraph Matching on Large Graphs using Graphics Processors

Privacy Preserving Subgraph Matching on Large Graphs in Cloud

Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16, 2016

DOI: 10.1145/2882903.2882956

Author(s): Zhao Chang, Lei Zou, Feifei Li

Keyword(s): Privacy Preserving, Large Graphs, Subgraph Matching

SASUM: A Sharing-Based Approach to Fast Approximate Subgraph Matching for Large Graphs

IEICE Transactions on Information and Systems, 2013, Vol E96.D (3), pp. 624-633

DOI: 10.1587/transinf.e96.d.624

Author(s): Song-Hyon KIM, Inchul SONG, Kyong-Ha LEE, Yoon-Joon LEE

Keyword(s): Large Graphs, Subgraph Matching

PBSM: An Efficient Top-K Subgraph Matching Algorithm

International Journal of Pattern Recognition and Artificial Intelligence, 2018, Vol 32 (06), pp. 1850020

DOI: 10.1142/s0218001418500209

Author(s): Wei Chen, Jia Liu, Ziyang Chen, Xian Tang, Kaiyu Li

Keyword(s): Large Graphs, Research Issues, Matching Algorithm, Query Graph, Subgraph Matching, Minimum Number, Overall Performance, Data Graph, Memory Cost, Graph Data Management

Top-K subgraph matching is an active research issue in graph data management: given a data graph and a query graph, it finds the K subgraphs of the data graph that are isomorphic to the query graph and have the largest sums of weights. Existing methods for Top-K subgraph matching on large graphs usually use a filter-and-verify strategy, but they are inefficient in both stages. The filtering stage enumerates vertices repeatedly and incurs excessive memory cost, and the verification stage performs redundant verification. To address these problems, we propose a preprocessing step that compresses the graph based on equivalent vertices, which reduces enumeration. In the filtering stage, we reduce the memory cost by considering only direct neighbors. In the verification stage, we take the query-graph vertex with the minimum number of candidate vertices as the start vertex of the matching order, and use the idea of Ranking While Matching (RWM) to terminate the algorithm as early as possible by estimating an upper bound on the weights, which reduces redundant verification and improves overall performance. Experimental results show that our method is much more efficient than existing methods in both compression and processing time.
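
Two of the ideas in the abstract above can be sketched in a few lines of Python. The abstract does not define "equivalent vertices", so this sketch assumes that two vertices are equivalent when they share a label and an identical set of direct neighbors. The function names and the dictionary-based graph representation are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def compress_equivalent_vertices(adj, label):
    """Group vertices that share a label and an identical neighbor set.

    Assumption: this is one plausible notion of vertex equivalence,
    not necessarily the definition used by PBSM.
    adj maps each vertex to a set of neighbors; label maps vertex to label.
    """
    groups = defaultdict(list)
    for v in adj:
        key = (label[v], frozenset(adj[v]))
        groups[key].append(v)
    # Sort for a deterministic result.
    return sorted(sorted(group) for group in groups.values())


def pick_start_vertex(candidates):
    """Return the query vertex with the fewest candidate data vertices.

    candidates maps each query vertex to its set of candidate data vertices.
    Ties are broken by the vertex name to keep the choice deterministic.
    """
    return min(candidates, key=lambda u: (len(candidates[u]), u))


if __name__ == "__main__":
    adj = {1: {3}, 2: {3}, 3: {1, 2}}
    label = {1: "A", 2: "A", 3: "B"}
    print(compress_equivalent_vertices(adj, label))
    print(pick_start_vertex({"u1": {1, 2}, "u2": {3}}))
```

Starting the matching order at the query vertex with the fewest candidates keeps the search tree narrow near its root, which is the usual motivation for this heuristic.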

Efficient Techniques for Graph Searching and Biological Network Mining

Advances in Data Mining and Database Management - Graph Data Management, 2011, pp. 89-111

DOI: 10.4018/978-1-61350-053-8.ch005

Author(s): Alfredo Ferro, Rosalba Giugno, Alfredo Pulvirenti, Dennis Shasha

Keyword(s): Social Networks, Computational Complexity, Biological Network, Graph Searching, Search Methods, Large Graphs, Network Mining, Subgraph Matching, Key Concepts, Np Complete

From biochemical applications to social networks, graphs represent data. Comparing graphs or searching for motifs in such data often reveals interesting and useful patterns. Most of these problems on graphs are known to be NP-complete. Because of the computational complexity of subgraph matching, reducing the number of candidate graphs or restricting the space in which to search for motifs is critical to achieving efficiency. Therefore, optimizing and engineering isomorphism algorithms and designing indexing and suitable search methods for large graphs are the main directions investigated in the graph searching area. This chapter focuses on the key concepts underlying the existing algorithms. It first reviews the best-known algorithms for comparing two graphs, and then describes algorithms for searching large graphs, with emphasis on their application in biology.

Matrix-Free Finite-Element Computations On Graphics Processors with Adaptively Refined Unstructured Meshes

25th High Performance Computing Symposium (HPC 2017), 2017

DOI: 10.22360/springsim.2017.hpc.004

Keyword(s): Finite Element, Unstructured Meshes, Graphics Processors, Matrix Free

Evaluating the potential of graphics processors for high performance embedded computing

2011 Design, Automation & Test in Europe, 2011

DOI: 10.1109/date.2011.5763120

Author(s): Shuai Mu, Chenxi Wang, Ming Liu, Dongdong Li, Maohua Zhu, ...

Keyword(s): High Performance, Graphics Processors, Embedded Computing

The Laplacian Spectrum of Large Graphs Sampled from Graphons

IEEE Transactions on Network Science and Engineering, 2021, pp. 1-1

DOI: 10.1109/tnse.2021.3069675

Author(s): Renato Vizuete, Federica Garin, Paolo Frasca

Keyword(s): Laplacian Spectrum, Large Graphs

VColor*: a practical approach for coloring large graphs

Frontiers of Computer Science, 2021, Vol 15 (4)

DOI: 10.1007/s11704-020-9205-y

Author(s): Yun Peng, Xin Lin, Byron Choi, Bingsheng He

Keyword(s): Practical Approach, Large Graphs

Faster Motif Counting via Succinct Color Coding and Adaptive Sampling

ACM Transactions on Knowledge Discovery from Data, 2021, Vol 15 (6), pp. 1-27

DOI: 10.1145/3447397

Author(s): Marco Bressan, Stefano Leucci, Alessandro Panconesi

Keyword(s): Adaptive Sampling, Relative Frequency, State Of The Art, Color Coding, Input Graph, Large Graphs, Running Time, Uniform Sampling, Current State, Connected Subgraphs

We address the problem of computing the distribution of induced connected subgraphs, aka graphlets or motifs, in large graphs. The current state-of-the-art algorithms estimate the motif counts via uniform sampling by leveraging the color coding technique by Alon, Yuster, and Zwick. In this work, we extend the applicability of this approach by introducing a set of algorithmic optimizations and techniques that reduce the running time and space usage of color coding and improve the accuracy of the counts. To this end, we first show how to optimize color coding to efficiently build a compact table of a representative subsample of all graphlets in the input graph. For 8-node motifs, we can build such a table in one hour for a graph with 65M nodes and 1.8B edges, which is times larger than the state of the art. We then introduce a novel adaptive sampling scheme that breaks the “additive error barrier” of uniform sampling, guaranteeing multiplicative approximations instead of just additive ones. This allows us to count not only the most frequent motifs, but also extremely rare ones. For instance, on one graph we accurately count nearly 10,000 distinct 8-node motifs whose relative frequency is so small that uniform sampling would literally take centuries to find them. Our results show that color coding is still the most promising approach to scalable motif counting.
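
As a rough illustration of the color coding idea referenced in the abstract above, the Python sketch below counts "colorful" simple paths on k vertices, that is, paths whose vertices all receive distinct colors, using dynamic programming over color subsets. It is a simplified stand-in and not the authors' graphlet-counting algorithm: it handles only paths, takes the coloring as an input, and every name in it is hypothetical.

```python
from math import factorial


def count_colorful_paths(adj, color, k):
    """Count undirected simple paths on k >= 2 vertices with k distinct colors.

    adj maps each vertex to an iterable of neighbors (undirected graph);
    color maps each vertex to an integer in range(k).
    """
    # dp[(v, mask)] = number of paths ending at v that use exactly the
    # colors in the bitmask `mask`, each color once.
    dp = {}
    for v in adj:
        key = (v, 1 << color[v])
        dp[key] = dp.get(key, 0) + 1
    for _ in range(k - 1):
        nxt = {}
        for (v, mask), cnt in dp.items():
            for w in adj[v]:
                bit = 1 << color[w]
                if mask & bit:
                    continue  # color already used, path would not be colorful
                key = (w, mask | bit)
                nxt[key] = nxt.get(key, 0) + cnt
        dp = nxt
    # Each undirected path is found once from each of its two endpoints.
    return sum(dp.values()) // 2


def colorful_probability(k):
    """Probability that k fixed vertices get k distinct colors when each
    vertex is colored uniformly at random from k colors."""
    return factorial(k) / k ** k


if __name__ == "__main__":
    adj = {0: [1], 1: [0, 2], 2: [1]}
    print(count_colorful_paths(adj, {0: 0, 1: 1, 2: 2}, 3))
    print(colorful_probability(3))
```

With a uniformly random coloring, dividing the colorful count by `colorful_probability(k)` gives an unbiased estimate of the total number of k-vertex paths, and averaging over several independent colorings reduces the variance.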

Tiered Sampling

ACM Transactions on Knowledge Discovery from Data, 2021, Vol 15 (5), pp. 1-52

DOI: 10.1145/3441299

Author(s): Lorenzo De Stefani, Erisa Terolli, Eli Upfal

Keyword(s): Large Scale, Analysis Of Algorithms, Base Layer, Single Edge, Real World Data, High Quality, Large Graphs, Massive Graphs, Variance Estimate, Low Probability

We introduce Tiered Sampling, a novel technique for estimating the count of sparse motifs in massive graphs whose edges are observed in a stream. Our technique requires only a single pass over the data and uses a memory of fixed size M, which can be orders of magnitude smaller than the number of edges. Our methods address the challenging task of counting sparse motifs (sub-graph patterns) that have a low probability of appearing in a sample of M edges in the graph, which is the maximum amount of data available to the algorithms in each step. To obtain an unbiased and low-variance estimate of the count, we partition the available memory into tiers (layers) of reservoir samples. While the base layer is a standard reservoir sample of edges, other layers are reservoir samples of sub-structures of the desired motif. By storing more frequent sub-structures of the motif, we increase the probability of detecting an occurrence of the sparse motif we are counting, thus decreasing the variance and error of the estimate. While we focus on the design and analysis of algorithms for counting 4-cliques, we present a method that generalizes Tiered Sampling to obtain high-quality estimates for the number of occurrences of any sub-graph of interest, while reducing the analysis effort due to specific properties of the pattern of interest. We present a complete analysis and an extensive experimental evaluation of our proposed method using both synthetic and real-world data. Our results demonstrate the advantage of our method in obtaining high-quality approximations for the number of 4- and 5-cliques in large graphs using a very limited amount of memory, significantly outperforming the single edge sample approach for counting sparse motifs in large-scale graphs.
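
The base layer described in the abstract above is a standard reservoir sample of edges. A minimal Python sketch of that building block (the classic single-pass reservoir algorithm with fixed memory M) could look as follows. The function name and signature are hypothetical, and the higher tiers that sample sub-structures of the motif are not shown.

```python
import random


def reservoir_sample(stream, m, rng=None):
    """Keep a uniform random sample of at most m items from a stream,
    reading the stream once and storing at most m items at any time."""
    rng = rng or random.Random()
    sample = []
    for t, item in enumerate(stream, start=1):
        if len(sample) < m:
            # The first m items always enter the reservoir.
            sample.append(item)
        else:
            # The t-th item replaces a random slot with probability m / t.
            j = rng.randrange(t)
            if j < m:
                sample[j] = item
    return sample


if __name__ == "__main__":
    # Hypothetical edge stream: a path on 101 vertices.
    edges = [(i, i + 1) for i in range(100)]
    print(reservoir_sample(edges, 10, random.Random(0)))
```

After t items have been seen, each one is in the reservoir with probability min(1, m / t), which is what allows counts observed in the sample to be rescaled into unbiased estimates.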