subgraph query
Recently Published Documents


TOTAL DOCUMENTS

32
(FIVE YEARS 13)

H-INDEX

6
(FIVE YEARS 2)

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-18
Author(s):  
Xiaohuan Shan ◽  
Haihai Li ◽  
Chunjie Jia ◽  
Dong Li ◽  
Baoyan Song

Interesting subgraph query aims to find subgraphs of a data graph that are isomorphic to a given query graph and to rank them by their interestingness scores. However, existing subgraph query approaches are inefficient on large-scale labeled data graphs, for two reasons: (i) existing work mainly focuses on unweighted query graphs and ignores the impact of query constraints on query results; (ii) an excessive number of subgraph candidates, or complex joins between nodes in the subgraph candidates, reduces query efficiency. To solve these problems, this paper proposes an intelligent solution. First, an Isotype Structure Graph Compression (ISGC) strategy compresses similar nodes in a graph to reduce the size of the graph and avoid unnecessary matching. Then, an auxiliary data structure, the Supergraph Topology Feature Index (STFIndex), is designed to replace the storage of the original data graph and improve the efficiency of online queries. After that, a partition method based on Edge Label Step Value (ELSV) is proposed to partition the index logically. In addition, a novel Top-K interesting subgraph query approach is proposed, which consists of the multidimensional filtering (MDF) strategy, upper bound value (UBV) (Size-c) matching, and the optimizational join (QJ) method, in order to filter out as many false subgraph candidates as possible and achieve fast joins. We conduct experiments on real and synthetic datasets. Experimental results show that the average performance of our approach is 1.35 higher than that of the state-of-the-art approaches when the query graph is unweighted, and 2.88 higher when the query graph is weighted.
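The general idea of a Top-K interesting subgraph query (filter candidates by label, match the query structure, rank matches by score) can be sketched as follows. This is a minimal brute-force illustration, not the ISGC/STFIndex/MDF pipeline of the paper. The function name `top_k_matches`, the choice of score (sum of matched data-edge weights), and the toy graph are all assumptions made for the example; matching here is non-induced subgraph isomorphism.

```python
import heapq

def top_k_matches(q_labels, q_edges, g_labels, g_weights, k):
    """Return the k highest-scoring matches of a labeled query graph.

    q_labels:  {query_vertex: label}
    q_edges:   list of undirected query edges (u, v)
    g_labels:  {data_vertex: label}
    g_weights: {frozenset({a, b}): weight} for each undirected data edge
    Score (an assumption of this sketch): sum of matched data-edge weights.
    """
    order = list(q_labels)
    # Label filter: only data vertices with the same label are candidates.
    cands = {u: [v for v, lab in g_labels.items() if lab == q_labels[u]]
             for u in order}
    results = {}  # matched data-edge set -> score (dedupes automorphic matches)

    def extend(i, mapping):
        if i == len(order):
            edges = frozenset(frozenset((mapping[a], mapping[b]))
                              for a, b in q_edges)
            results[edges] = sum(g_weights[e] for e in edges)
            return
        u = order[i]
        for v in cands[u]:
            if v in mapping.values():
                continue  # keep the mapping injective
            # Every query edge between u and an already-matched vertex
            # must be present in the data graph.
            ok = all(
                frozenset((v, mapping[b if a == u else a])) in g_weights
                for a, b in q_edges
                if (a == u and b in mapping) or (b == u and a in mapping)
            )
            if ok:
                mapping[u] = v
                extend(i + 1, mapping)
                del mapping[u]

    extend(0, {})
    return heapq.nlargest(k, results.items(), key=lambda kv: kv[1])

# Toy data graph (hypothetical): labels and weighted undirected edges.
g_labels = {1: "A", 2: "B", 3: "C", 4: "B", 5: "C"}
g_weights = {
    frozenset((1, 2)): 1.0, frozenset((2, 3)): 2.0, frozenset((1, 3)): 0.5,
    frozenset((1, 4)): 3.0, frozenset((4, 5)): 1.0, frozenset((1, 5)): 4.0,
}
# Query: a triangle with labels A, B, C.
q_labels = {"a": "A", "b": "B", "c": "C"}
q_edges = [("a", "b"), ("b", "c"), ("a", "c")]

top = top_k_matches(q_labels, q_edges, g_labels, g_weights, k=2)
scores = [score for _, score in top]
```

In this toy graph the two labeled triangles are {1, 4, 5} and {1, 2, 3}; the exhaustive search is only practical for tiny inputs, which is the inefficiency that index-based filtering is meant to address.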


Author(s):  
Hyunjoon Kim ◽  
Yunyoung Choi ◽  
Kunsoo Park ◽  
Xuemin Lin ◽  
Seok-Hee Hong ◽  
...  

2021 ◽  
Vol 46 (2) ◽  
pp. 1-45
Author(s):  
Amine Mhedhbi ◽  
Chathura Kankanamge ◽  
Semih Salihoglu

We study the problem of optimizing one-time and continuous subgraph queries using the new worst-case optimal join plans. Worst-case optimal plans evaluate queries by matching one query vertex at a time using multiway intersections. The core problem in optimizing worst-case optimal plans is to pick an ordering of the query vertices to match. We make two main contributions: 1. A cost-based dynamic programming optimizer for one-time queries that (i) picks efficient query vertex orderings for worst-case optimal plans and (ii) generates hybrid plans that mix traditional binary joins with worst-case optimal style multiway intersections. In addition to our optimizer, we describe an adaptive technique that changes the query vertex orderings of the worst-case optimal subplans during query execution for more efficient query evaluation. The plan space of our one-time optimizer contains plans that are not in the plan spaces based on tree decompositions from prior work. 2. A cost-based greedy optimizer for continuous queries that builds on the delta subgraph query framework. Given a set of continuous queries, our optimizer decomposes these queries into multiple delta subgraph queries, picks a plan for each delta query, and generates a single combined plan that evaluates all of the queries. Our combined plans share computations across operators of the plans for the delta queries if the operators perform the same intersections. To increase the amount of computation shared, we describe an additional optimization that shares partial intersections across operators. Our optimizers use a new cost metric for worst-case optimal plans called intersection-cost. When generating hybrid plans, our dynamic programming optimizer for one-time queries combines intersection-cost with the cost of binary joins. We demonstrate the effectiveness of our plans, adaptive technique, and partial intersection sharing optimization through extensive experiments. Our optimizers are integrated into GraphflowDB.
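Matching one query vertex at a time with multiway intersections can be illustrated with a short sketch. It assumes undirected graphs stored as adjacency sets and a fixed, caller-supplied vertex ordering; the function name `match` and the toy graphs are hypothetical, and the sketch does not reproduce the optimizers, the intersection-cost metric, or any GraphflowDB API.

```python
def match(query_adj, data_adj, order):
    """Enumerate injective embeddings of a query graph in a data graph.

    query_adj, data_adj: {vertex: set(neighbours)} for undirected graphs.
    order: the query vertex ordering to follow (choosing a good ordering
           is the optimization problem; here it is simply given).
    """
    results = []

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        u = order[i]
        # Data vertices already assigned to query neighbours of u.
        matched_nbrs = [mapping[w] for w in query_adj[u] if w in mapping]
        if matched_nbrs:
            # Multiway intersection of their adjacency sets.
            cands = set.intersection(*(data_adj[x] for x in matched_nbrs))
        else:
            cands = set(data_adj)  # no constraint yet: scan all vertices
        for v in cands - set(mapping.values()):
            mapping[u] = v
            extend(i + 1, mapping)
            del mapping[u]

    extend(0, {})
    return results

# Toy data graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
data_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
# Query: a triangle on vertices a, b, c.
query_adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}

embeddings = match(query_adj, data_adj, order=["a", "b", "c"])
```

For the third query vertex `c`, the candidate set is the intersection of the adjacency sets of the vertices matched to `a` and `b`, so no intermediate list of open paths is materialized as a binary join would require. The single data triangle is reported once per automorphism of the query.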


Author(s):  
Ryan Alweiss ◽  
Chady Ben Hamida ◽  
Xiaoyu He ◽  
Alexander Moreira

Given a fixed graph H, a real number p ∈ (0, 1) and an infinite Erdős–Rényi graph G ∼ G(∞, p), how many adjacency queries do we have to make to find a copy of H inside G with probability at least 1/2? Determining this number f(H, p) is a variant of the subgraph query problem introduced by Ferber, Krivelevich, Sudakov and Vieira. For every graph H, we improve the trivial upper bound of f(H, p) = O(p^{-d}), where d is the degeneracy of H, by exhibiting an algorithm that finds a copy of H in time o(p^{-d}) as p goes to 0. Furthermore, we prove that there are 2-degenerate graphs which require p^{-2+o(1)} queries, showing for the first time that there exist graphs H for which f(H, p) does not grow like a constant power of p^{-1} as p goes to 0. Finally, we answer a question of Feige, Gamarnik, Neeman, Rácz and Tetali by showing that for any δ < 2, there exists α < 2 such that one cannot find a clique of order α log_2 n in G(n, 1/2) in n^δ queries.
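The adjacency-query model itself is easy to simulate. The sketch below assumes a lazily sampled random graph in which each vertex pair receives one memoized coin flip, and uses a naive vertex-by-vertex search for a triangle as H. The names `AdjacencyOracle` and `find_triangle` are hypothetical, and the search is a simple baseline for counting queries, not the algorithm from the paper.

```python
import random
from itertools import combinations

class AdjacencyOracle:
    """Lazily sampled random graph: one memoized coin flip per vertex pair."""

    def __init__(self, p, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        self.answers = {}  # (u, v) with u < v -> bool

    def adjacent(self, u, v):
        key = (min(u, v), max(u, v))
        if key not in self.answers:
            # First time this pair is asked about: this is a new query.
            self.answers[key] = self.rng.random() < self.p
        return self.answers[key]

    @property
    def queries(self):
        # Number of distinct pairs queried so far.
        return len(self.answers)

def find_triangle(oracle, max_vertices=200):
    """Naive baseline: reveal vertices one by one, querying each new vertex
    against all earlier ones, and stop at the first triangle."""
    for v in range(max_vertices):
        nbrs = [u for u in range(v) if oracle.adjacent(u, v)]
        for a, b in combinations(nbrs, 2):
            # The pair (a, b) was already queried when b was revealed,
            # so this lookup is memoized and costs no new query.
            if oracle.adjacent(a, b):
                return (a, b, v)
    return None  # give up after max_vertices (an arbitrary cap)

oracle = AdjacencyOracle(p=0.5, seed=1)
triangle = find_triangle(oracle)
used = oracle.queries
```

Because this baseline queries every pair among the revealed vertices, it spends many queries on pairs that cannot help; the question studied in the abstract is how few queries suffice when the pairs to query are chosen adaptively.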


Information ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 61
Author(s):  
Xiaohuan Shan ◽  
Chunjie Jia ◽  
Linlin Ding ◽  
Xingyan Ding ◽  
Baoyan Song

A labeled graph is a special structure with node identification capability, which is often used in information networks, biological networks, and other fields. The subgraph query is widely used as an important means of graph data analysis. As labeled graphs grow in size and change dynamically, users tend to focus on the high-match results that are of interest to them, and they want to exploit the relationships among results and limit the number of results in order to obtain query answers quickly. For this reason, we consider the individual needs of users and propose a dynamic Top-K interesting subgraph query. This method establishes a novel graph topology feature index (GTSF index), comprising a node topology feature index (NTF index) and an edge feature index (EF index), which can effectively prune and filter out the invalid nodes and edges that do not meet the restricted condition. A multi-factor candidate set filtering strategy based on the GTSF index is proposed, which prunes further to obtain smaller candidate sets. Then, we propose a dynamic Top-K interesting subgraph query method based on the idea of the sliding window, which updates the subgraph matching results as the labeled graph evolves so that query results remain real-time and accurate. In addition, considering factors such as frequent Input/Output (I/O) and network communication overheads, an optimization mechanism for graph changes and an incremental maintenance strategy for the index are proposed to reduce the huge cost of redundant operations and global updates. The experimental results show that the proposed method can effectively deal with a dynamic Top-K interesting subgraph query on a large-scale labeled graph; at the same time, the optimization mechanism for graph changes and the incremental maintenance strategy for the index can effectively reduce the maintenance overheads.

