Symmetric continuous subgraph matching with bidirectional dynamic programming

In many real datasets such as social media streams and cyber data sources, graphs change over time through a graph update stream of edge insertions and deletions. Detecting critical patterns in such dynamic graphs plays an important role in application domains such as fraud detection, cyber security, and recommendation systems for social networks. Given a dynamic data graph and a query graph, the continuous subgraph matching problem is to find all positive matches for each edge insertion and all negative matches for each edge deletion. The state-of-the-art algorithm TurboFlux uses a spanning tree of the query graph for filtering. However, a spanning tree may give low pruning power because it does not take into account all edges of the query graph. In this paper, we present SymBi, a symmetric and much faster algorithm that maintains an auxiliary data structure based on a directed acyclic graph instead of a spanning tree; this structure stores the intermediate results of bidirectional dynamic programming between the query graph and the dynamic graph. Extensive experiments with real and synthetic datasets show that SymBi outperforms the state-of-the-art algorithm by up to three orders of magnitude in terms of elapsed time.
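To make the problem statement concrete, the following is a minimal brute-force sketch of finding the positive matches for one edge insertion. It is not SymBi or TurboFlux: it uses plain backtracking with no auxiliary index, and it assumes an undirected, vertex-labeled data graph, a connected query graph, and non-induced matching. The names `positive_matches`, `Graph`, and `Labels` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, Set, Tuple

Graph = Dict[int, Set[int]]    # undirected adjacency sets (assumed representation)
Labels = Dict[int, Hashable]   # vertex -> label


def positive_matches(query: Graph, q_labels: Labels,
                     data: Graph, d_labels: Labels,
                     new_edge: Tuple[int, int]) -> Set[Tuple[Tuple[int, int], ...]]:
    """Return every embedding of `query` in `data` that uses `new_edge`.

    `data` must already contain `new_edge`, and `query` must be connected.
    Each embedding is a sorted tuple of (query_vertex, data_vertex) pairs.
    """
    a, b = new_edge
    found = set()

    def extend(mapping: Dict[int, int]) -> None:
        if len(mapping) == len(query):
            found.add(tuple(sorted(mapping.items())))
            return
        # Pick an unmapped query vertex that touches the partial embedding.
        qv = next(x for x in query
                  if x not in mapping and any(n in mapping for n in query[x]))
        anchor = next(n for n in query[qv] if n in mapping)
        used = set(mapping.values())
        for dv in data[mapping[anchor]]:
            if dv in used or d_labels[dv] != q_labels[qv]:
                continue
            # Every already-mapped query neighbor must be a data neighbor too.
            if all(mapping[n] in data[dv] for n in query[qv] if n in mapping):
                mapping[qv] = dv
                extend(mapping)
                del mapping[qv]

    # Seed the search by mapping each directed query edge onto (a, b).
    # Both orientations are covered because adjacency is symmetric.
    for qu in query:
        for qw in query[qu]:
            if q_labels[qu] == d_labels[a] and q_labels[qw] == d_labels[b]:
                extend({qu: a, qw: b})
    return found
```

Under the same assumptions, calling the function on the graph as it was just before an edge is deleted returns the negative matches for that deletion. Index-based algorithms avoid most of this search by pruning candidates before backtracking.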


Subgraph-Indexed Sequential Subdivision for Continuous Subgraph Matching on Dynamic Knowledge Graph

Complexity ◽

10.1155/2020/8871756 ◽

2020 ◽

Vol 2020 ◽

pp. 1-18

Author(s):

Yunhao Sun ◽

Guanyu Li ◽

Mengmeng Guan ◽

Bo Ning

Keyword(s):

Empirical Studies ◽

Search Space ◽

Knowledge Graph ◽

Dynamic Graph ◽

Matching Problem ◽

Flow Graph ◽

Query Graph ◽

Subgraph Matching ◽

Wide Range ◽

Multiple Edges

The continuous subgraph matching problem on dynamic graphs has become a popular research topic in the field of graph analysis, with a wide range of applications including information retrieval and community detection. Specifically, given a query graph q, an initial graph G_0, and a graph update stream △G_i, the problem of continuous subgraph matching is to sequentially find all subgraphs of G_i (= G_0 ⊕ △G_i) that are isomorphic to q and cover △G_i. Since a knowledge graph is a directed labeled multigraph with multiple edges between a pair of vertices, it brings new challenges for the problem on dynamic knowledge graphs. One challenge is that the multigraph characteristic of a knowledge graph intensifies the complexity of candidate calculation, which combines complex topological and attributed structures. Another challenge is that the isomorphic subgraphs covering a given region are searched for in a huge space of seed candidates, so much time is spent on unpromising candidates. To address these challenges, a method of subgraph-indexed sequential subdivision is proposed to accelerate continuous subgraph matching on dynamic knowledge graphs. Firstly, a flow graph index is proposed to arrange the search space of seed candidates in the topological knowledge graph, and an adjacent index is designed to accelerate the identification of candidate activation states in the attributed knowledge graph. Secondly, the sequential subdivision of the flow graph index and the transition state model are employed to incrementally conduct subgraph matching and maintain the regional influence of changed candidates, respectively. Finally, extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms.


Large-scale Semantic Parsing without Question-Answer Pairs

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00190 ◽

2014 ◽

Vol 2 ◽

pp. 377-392 ◽

Cited By ~ 40

Author(s):

Siva Reddy ◽

Mirella Lapata ◽

Mark Steedman

Keyword(s):

Natural Language ◽

Large Scale ◽

Graph Matching ◽

State Of The Art ◽

The State ◽

Semantic Parsing ◽

Matching Problem ◽

Weak Supervision ◽

Benchmark Datasets

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.


An Overview of the State-of-the-Art of Cloud Computing Cyber-Security

Codes, Cryptology and Information Security - Lecture Notes in Computer Science ◽

10.1007/978-3-319-55589-8_4 ◽

2017 ◽

pp. 56-67 ◽

Cited By ~ 5

Author(s):

H. Bennasar ◽

A. Bendahmane ◽

M. Essaaidi

Keyword(s):

Cloud Computing ◽

Cyber Security ◽

State Of The Art ◽

The State


Exact Methods for the Traveling Salesman Problem with Drone

Transportation Science ◽

10.1287/trsc.2020.1017 ◽

2021 ◽

Vol 55 (2) ◽

pp. 315-335

Author(s):

Roberto Roberti ◽

Mario Ruthmair

Keyword(s):

Dynamic Programming ◽

Traveling Salesman Problem ◽

State Of The Art ◽

The State ◽

Set Partitioning ◽

Traveling Salesman ◽

Mixed Integer ◽

Branch And Price ◽

Last Mile ◽

The Traveling Salesman Problem

Efficiently handling last-mile deliveries becomes more and more important nowadays. Using drones to support classical vehicles allows improving delivery schedules as long as efficient solution methods to plan last-mile deliveries with drones are available. We study exact solution approaches for some variants of the traveling salesman problem with drone (TSP-D) in which a truck and a drone are teamed up to serve a set of customers. This combination of truck and drone can exploit the benefits of both vehicle types: the truck has a large capacity but usually low travel speed in urban areas; the drone is faster and not restricted to street networks, but its range and carrying capacity are limited. We propose a compact mixed-integer linear program (MILP) for several TSP-D variants that is based on timely synchronizing truck and drone flows; such an MILP is easy to implement but nevertheless leads to competitive results compared with the state-of-the-art MILPs. Furthermore, we introduce dynamic programming recursions to model several TSP-D variants. We show how these dynamic programming recursions can be exploited in an exact branch-and-price approach based on a set partitioning formulation using ng-route relaxation and a three-level hierarchical branching. The proposed branch-and-price can solve instances with up to 39 customers to optimality outperforming the state-of-the-art by more than doubling the manageable instance size. Finally, we analyze different scenarios and show that even a single drone can significantly reduce a route’s completion time when the drone is sufficiently fast.


A Comprehensive Tutorial and Survey of Applications of Deep Learning for Cyber Security

10.36227/techrxiv.11473377.v1 ◽

2020 ◽

Author(s):

vinayakumar ravi ◽

Soman KP ◽

Mamoun Alazab ◽

Sriram S ◽

Simran k

Keyword(s):

Deep Learning ◽

Cyber Security ◽

State Of The Art ◽

The State ◽

Research Papers ◽

Security Applications ◽

Learning Architectures

This work aims to review the state-of-the-art deep learning architectures in Cyber Security applications by highlighting the contributions and challenges from various recent research papers.


Supergraph Topology Feature Index for Personalized Interesting Subgraph Query in Large Labeled Graphs

Complexity ◽

10.1155/2021/9274429 ◽

2021 ◽

Vol 2021 ◽

pp. 1-18

Author(s):

Xiaohuan Shan ◽

Haihai Li ◽

Chunjie Jia ◽

Dong Li ◽

Baoyan Song

Keyword(s):

Large Scale ◽

State Of The Art ◽

Original Data ◽

The State ◽

Average Performance ◽

Query Graph ◽

Subgraph Query ◽

Data Graph ◽

Partition Method ◽

The Impact

Interesting subgraph query aims to find subgraphs of a data graph that are isomorphic to the given query graph and to rank them according to their interestingness scores. However, the existing subgraph query approaches are inefficient when dealing with large-scale labeled data graphs. This is caused by the following problems: (i) existing work mainly focuses on unweighted query graphs and ignores the impact of query constraints on query results; (ii) an excessive number of subgraph candidates, or complex joins between nodes in the subgraph candidates, reduces query efficiency. To solve these problems, this paper proposes an intelligent solution. Firstly, an Isotype Structure Graph Compression (ISGC) strategy is proposed to compress similar nodes in a graph to reduce the size of the graph and avoid unnecessary matching. Then, an auxiliary data structure, the Supergraph Topology Feature Index (STFIndex), is designed to replace the storage of the original data graph and improve the efficiency of an online query. After that, a partition method based on Edge Label Step Value (ELSV) is proposed to partition the index logically. In addition, a novel Top-K interest subgraph query approach is proposed, which consists of the multidimensional filtering (MDF) strategy, upper bound value (UBV) (Size-c) matching, and the optimizational join (QJ) method to filter out as many false subgraph candidates as possible to achieve fast joins. We conduct experiments on real and synthetic datasets. Experimental results show that the average performance of our approach is 1.35 higher than that of the state-of-the-art approaches when the query graph is unweighted, and 2.88 higher when the query graph is weighted.


A Comprehensive Tutorial and Survey of Applications of Deep Learning for Cyber Security

10.36227/techrxiv.11473377 ◽

2020 ◽

Author(s):

vinayakumar ravi ◽

Soman KP ◽

Mamoun Alazab ◽

Sriram S ◽

Simran k

Keyword(s):

Deep Learning ◽

Cyber Security ◽

State Of The Art ◽

The State ◽

Research Papers ◽

Security Applications ◽

Learning Architectures


LEAP: A Generalization of the Landau-Vishkin Algorithm with Custom Gap Penalties

10.1101/133157 ◽

2017 ◽

Author(s):

Hongyi Xin ◽

Jeremie Kim ◽

Sunny Nahar ◽

Can Alkan ◽

Onur Mutlu

Keyword(s):

State Of The Art ◽

String Matching ◽

The State ◽

Levenshtein Distance ◽

Approximate String Matching ◽

Matching Problem ◽

De Bruijn Sequence ◽

Scoring Schemes ◽

Bit Vector ◽

Selection Of

Motivation: Approximate String Matching is a pivotal problem in the field of computer science. It serves as an integral component for many string algorithms, most notably, DNA read mapping and alignment. The improved LV algorithm proposes an improved dynamic programming strategy over the banded Smith-Waterman algorithm but supports only a limited selection of scoring schemes. In this paper, we propose the Leaping Toad problem, a generalization of the approximate string matching problem, as well as LEAP, a generalization of the Landau-Vishkin algorithm that solves the Leaping Toad problem under a broader selection of scoring schemes. Results: We benchmarked LEAP against 3 state-of-the-art approximate string matching implementations. We show that when using a bit-vectorized de Bruijn sequence based optimization, LEAP is up to 7.4x faster than the state-of-the-art bit-vector Levenshtein distance implementation and up to 32x faster than the state-of-the-art affine-gap-penalty parallel Needleman-Wunsch implementation. Availability: We provide an implementation of LEAP in C++ at github.com/CMU-SAFARI/[email protected], [email protected] or [email protected]
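For readers unfamiliar with the diagonal-based strategy that the abstract builds on, here is a minimal sketch of the furthest-reaching-diagonal idea for plain unit-cost edit distance. It is not LEAP and does not handle custom gap penalties or bit-vector optimizations; the function name `lv_edit_distance` and the dictionary representation of diagonals are assumptions made for illustration.

```python
def lv_edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via furthest-reaching diagonals.

    Diagonal d holds cells (i, i + d), where i indexes `a` and i + d indexes `b`.
    For each error count e, we store the largest row i reachable on each diagonal.
    """
    m, n = len(a), len(b)

    def slide(i: int, d: int) -> int:
        # Follow free matches along diagonal d as far as possible.
        while i < m and i + d < n and a[i] == b[i + d]:
            i += 1
        return i

    target = n - m                      # diagonal containing cell (m, n)
    prev = {0: slide(0, 0)}             # zero errors: only the main diagonal
    e = 0
    while prev.get(target, -1) < m:
        e += 1
        cur = {}
        for d in range(-e, e + 1):
            if d < -m or d > n:
                continue                # diagonal lies outside the matrix
            best = -1
            if d in prev:               # substitution: stay on d, next row
                best = max(best, prev[d] + 1)
            if d - 1 in prev:           # extra character in b: same row
                best = max(best, prev[d - 1])
            if d + 1 in prev:           # extra character in a: next row
                best = max(best, prev[d + 1] + 1)
            if best < 0:
                continue
            best = min(best, m, n - d)  # clamp to the matrix boundary
            cur[d] = slide(best, d)
        prev = cur
    return e
```

The work done is proportional to the number of diagonals visited per error level rather than to the full matrix, which is why this family of algorithms is attractive when the expected distance is small.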


Optimized Distributed Subgraph Matching Algorithm Based on Partition Replication

Electronics ◽

10.3390/electronics9010184 ◽

2020 ◽

Vol 9 (1) ◽

pp. 184

Author(s):

Ling Yuan ◽

Jiali Bin ◽

Peng Pan

Keyword(s):

Large Scale ◽

High Efficiency ◽

Search Space ◽

Dynamic Graph ◽

Matching Algorithm ◽

Query Graph ◽

Subgraph Matching ◽

Match Algorithm ◽

Large Scale Data ◽

Scale Data

At present, with the explosive growth of data scale, subgraph matching on massive graph data struggles to meet efficiency requirements. Meanwhile, the graph indexes used in existing subgraph matching algorithms are difficult to update and maintain when facing dynamic graphs. We propose a distributed subgraph matching algorithm based on Partition Replica (denoted PR-Match) to handle the partitioning and storage of large-scale data graphs. The PR-Match algorithm first splits the query graph into sub-queries, then assigns the sub-queries to the nodes for subgraph matching, and finally merges the matching results. In the PR-Match algorithm, we propose a heuristic rule based on predicted cost to select the optimal merging plan, which greatly reduces the cost of merging. To accelerate the matching of sub-queries, a vertex code based on the vertex neighbor label signature is proposed, which greatly reduces the search space for each sub-query. Because the vertex code is maintained incrementally, it solves the problem that a feature-based graph index is difficult to maintain on a dynamic graph. Extensive experiments on real and synthetic datasets demonstrate the high efficiency and strong scalability of the PR-Match algorithm when handling large-scale data graphs.
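The abstract does not specify how the vertex code is encoded, so the following is only one plausible sketch of a neighbor-label signature: each label is assigned a bit, a vertex's code is the bitwise OR of its neighbors' label bits, and an edge insertion updates two codes in constant time. The class `NeighborSignatureIndex`, its methods, and the bit assignment are hypothetical and are not taken from PR-Match.

```python
from typing import Dict, Hashable


class NeighborSignatureIndex:
    """Per-vertex bitmask of neighbor labels, updated incrementally (a sketch)."""

    def __init__(self, labels: Dict[int, Hashable], label_bits: Dict[Hashable, int]):
        self.labels = dict(labels)          # vertex -> label
        self.bit = dict(label_bits)         # label -> bit position (assumed given)
        self.code = {v: 0 for v in labels}  # vertex -> neighbor-label bitmask

    def insert_edge(self, u: int, v: int) -> None:
        # Each endpoint records the label of its new neighbor.
        self.code[u] |= 1 << self.bit[self.labels[v]]
        self.code[v] |= 1 << self.bit[self.labels[u]]

    def may_match(self, data_vertex: int, query_label: Hashable, query_code: int) -> bool:
        # A data vertex is a candidate only if its label matches and it has
        # every neighbor label that the query vertex requires.
        same_label = self.labels[data_vertex] == query_label
        return same_label and (query_code & ~self.code[data_vertex]) == 0
```

A plain bitmask cannot undo an insertion, so supporting edge deletions in this sketch would require per-label neighbor counts instead of single bits. The filter is also only a necessary condition: vertices that pass it still need full verification.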


The State of the Art in Dynamic Graph Algorithms

SOFSEM 2018: Theory and Practice of Computer Science - Lecture Notes in Computer Science ◽

10.1007/978-3-319-73117-9_3 ◽

2017 ◽

pp. 40-44

Author(s):

Monika Henzinger

Keyword(s):

Graph Algorithms ◽

State Of The Art ◽

The State ◽

Dynamic Graph ◽

Dynamic Graph Algorithms
