query evaluation
Recently Published Documents


TOTAL DOCUMENTS

382
(FIVE YEARS 39)

H-INDEX

32
(FIVE YEARS 3)

2022 ◽  
Vol 13 (2) ◽  
pp. 1-28
Author(s):  
Yan Tang ◽  
Weilong Cui ◽  
Jianwen Su

A business process (workflow) is an assembly of tasks to accomplish a business goal. Real-world workflow models often need to change due to new laws and policies, changes in the environment, and so on. To understand the inner workings of a business process and facilitate changes, workflow logs have the potential to enable inspecting, monitoring, diagnosing, analyzing, and improving the design of a complex workflow. Querying workflow logs, however, is still mostly an ad hoc practice by workflow managers. In this article, we focus on the problem of querying workflow logs with respect to both control flow and dataflow properties. We develop a query language based on “incident patterns” to allow the user to directly query workflow logs instead of having to transform such queries into database operations. We provide the formal semantics of our language and a query evaluation algorithm for it. By deriving an accurate cost model, we develop an optimization mechanism to accelerate query evaluation. Our experimental results demonstrate the effectiveness of the optimization, which achieves up to a 50× speedup over an adaptation of an existing evaluation method.


2022 ◽  
Vol 18 (1) ◽  
Author(s):  
Antoine Amarilli ◽  
İsmail İlkan Ceylan

We study the problem of query evaluation on probabilistic graphs, namely, tuple-independent probabilistic databases over signatures of arity two. We focus on the class of queries closed under homomorphisms, or, equivalently, the infinite unions of conjunctive queries. Our main result states that the probabilistic query evaluation problem is #P-hard for all unbounded queries from this class. As bounded queries from this class are equivalent to a union of conjunctive queries, they are already classified by the dichotomy of Dalvi and Suciu (2012). Hence, our result and theirs imply a complete data complexity dichotomy, between polynomial time and #P-hardness, on evaluating homomorphism-closed queries over probabilistic graphs. This dichotomy covers in particular all fragments of infinite unions of conjunctive queries over arity-two signatures, such as negation-free (disjunctive) Datalog, regular path queries, and a large class of ontology-mediated queries. The dichotomy also applies to a restricted case of probabilistic query evaluation called generalized model counting, where fact probabilities must be 0, 0.5, or 1. We show the main result by reducing from the problem of counting the valuations of positive partitioned 2-DNF formulae, or from the source-to-target reliability problem in an undirected graph, depending on properties of minimal models for the query.
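
To make the problem statement concrete, the following is a minimal brute-force sketch of probabilistic query evaluation on a tuple-independent probabilistic graph. It is not the authors' method: it simply enumerates all possible worlds, which takes exponential time. The function names `query_probability` and `has_two_step_path`, and the example edge probabilities, are assumptions made for illustration. The example query (existence of a path of length two) is a Boolean conjunctive query and is therefore closed under homomorphisms.

```python
from itertools import product


def query_probability(edges, query):
    """Probability that `query` holds in a tuple-independent probabilistic graph.

    edges: dict mapping an edge (u, v) to its independent probability.
    query: function taking a set of edges (one possible world) and returning a bool.
    """
    facts = list(edges)
    total = 0.0
    # Enumerate every possible world: each fact is independently kept or dropped.
    for keep in product([False, True], repeat=len(facts)):
        world = {fact for fact, kept in zip(facts, keep) if kept}
        weight = 1.0
        for fact, kept in zip(facts, keep):
            weight *= edges[fact] if kept else 1.0 - edges[fact]
        if query(world):
            total += weight
    return total


def has_two_step_path(world):
    """Boolean conjunctive query: there exist x, y, z with E(x, y) and E(y, z)."""
    return any(v == x for (_, v) in world for (x, _) in world)


# Hypothetical instance: three uncertain edges, each present with probability 0.5.
example_edges = {("a", "b"): 0.5, ("b", "c"): 0.5, ("b", "d"): 0.5}
probability = query_probability(example_edges, has_two_step_path)
```

The enumeration visits 2^n worlds for n facts, so it is only usable on tiny instances. The abstract's hardness result concerns whether this blow-up can be avoided for a given query.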


2021 ◽  
Vol 50 (2) ◽  
pp. 6-17
Author(s):  
Johannes Doleschal ◽  
Benny Kimelfeld ◽  
Wim Martens

A common conceptual view of text analysis is that of a two-step process, where we first extract relations from text documents and then apply a relational query over the result. Hence, text analysis shares technical challenges with, and can draw ideas from, relational databases. A framework that formally instantiates this connection is that of the document spanners. In this article, we review recent advances in various research efforts that adapt fundamental database concepts to text analysis through the lens of document spanners. Among others, we discuss aspects of query evaluation, aggregate queries, provenance, and distributed query planning.


2021 ◽  
Vol 99 ◽  
pp. 101738
Author(s):  
Ishaq Zouaghi ◽  
Amin Mesmoudi ◽  
Jorge Galicia ◽  
Ladjel Bellatreche ◽  
Taoufik Aguili

2021 ◽  
Vol 46 (2) ◽  
pp. 1-45
Author(s):  
Amine Mhedhbi ◽  
Chathura Kankanamge ◽  
Semih Salihoglu

We study the problem of optimizing one-time and continuous subgraph queries using the new worst-case optimal join plans. Worst-case optimal plans evaluate queries by matching one query vertex at a time using multiway intersections. The core problem in optimizing worst-case optimal plans is to pick an ordering of the query vertices to match. We make two main contributions: (1) A cost-based dynamic programming optimizer for one-time queries that (i) picks efficient query vertex orderings for worst-case optimal plans and (ii) generates hybrid plans that mix traditional binary joins with worst-case optimal style multiway intersections. In addition to our optimizer, we describe an adaptive technique that changes the query vertex orderings of the worst-case optimal subplans during query execution for more efficient query evaluation. The plan space of our one-time optimizer contains plans that are not in the plan spaces based on tree decompositions from prior work. (2) A cost-based greedy optimizer for continuous queries that builds on the delta subgraph query framework. Given a set of continuous queries, our optimizer decomposes these queries into multiple delta subgraph queries, picks a plan for each delta query, and generates a single combined plan that evaluates all of the queries. Our combined plans share computations across operators of the plans for the delta queries if the operators perform the same intersections. To increase the amount of computation shared, we describe an additional optimization that shares partial intersections across operators. Our optimizers use a new cost metric for worst-case optimal plans called intersection-cost. When generating hybrid plans, our dynamic programming optimizer for one-time queries combines intersection-cost with the cost of binary joins. We demonstrate the effectiveness of our plans, adaptive technique, and partial intersection sharing optimization through extensive experiments. Our optimizers are integrated into GraphflowDB.
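
The idea of matching one query vertex at a time using intersections can be sketched for the directed triangle query a→b, b→c, c→a. This is an illustrative sketch only, with a fixed vertex ordering (a, b, c) chosen by hand. It is not the GraphflowDB optimizer and does not model intersection-cost. The function name `match_triangles` and the example edge list are assumptions.

```python
from collections import defaultdict


def match_triangles(edges):
    """Find all (a, b, c) with edges a->b, b->c and c->a, one query vertex at a time."""
    forward = defaultdict(set)   # forward[u] = {v : u -> v}
    backward = defaultdict(set)  # backward[v] = {u : u -> v}
    for u, v in edges:
        forward[u].add(v)
        backward[v].add(u)

    empty = frozenset()
    matches = []
    for a in list(forward):              # match query vertex a
        for b in forward[a]:             # match b: must satisfy a -> b
            # Match c with a single intersection: c must satisfy b -> c AND c -> a.
            candidates = forward.get(b, empty) & backward.get(a, empty)
            for c in candidates:
                matches.append((a, b, c))
    return matches


# Hypothetical input: one directed 3-cycle plus an extra edge that closes no triangle.
found = sorted(match_triangles([(1, 2), (2, 3), (3, 1), (1, 3)]))
```

A binary-join plan would first materialize all two-edge paths a→b→c and then filter on c→a. The intersection step avoids building that intermediate result. Choosing which vertex to match first is the ordering problem the abstract describes.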


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 149
Author(s):  
Petros Zervoudakis ◽  
Haridimos Kondylakis ◽  
Nicolas Spyratos ◽  
Dimitris Plexousakis

HIFUN is a high-level query language for expressing analytic queries over big datasets, offering a clear separation between the conceptual layer, where analytic queries are defined independently of the nature and location of data, and the physical layer, where queries are evaluated. In this paper, we present a methodology based on the HIFUN language, and the corresponding algorithms for the incremental evaluation of continuous queries. In essence, our approach is able to process the most recent data batch by exploiting already computed information, without requiring the evaluation of the query over the complete dataset. We present the generic algorithm, which we translated to both SQL and MapReduce using Spark; it implements various query rewriting methods. We demonstrate the effectiveness of our approach in terms of query answering efficiency. Finally, we show that by exploiting the formal query rewriting methods of HIFUN, we can further reduce the computational cost, adding another layer of query optimization to our implementation.
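
The general idea of incremental evaluation, processing only the newest batch and reusing already computed results, can be illustrated with a grouped sum. This is a plain-Python sketch under simplifying assumptions, not HIFUN syntax and not the paper's algorithm. The class name `IncrementalGroupedSum` and the sample batches are hypothetical.

```python
from collections import defaultdict


class IncrementalGroupedSum:
    """Maintains SUM(value) GROUP BY key across batches without rescanning old data."""

    def __init__(self):
        self.totals = defaultdict(float)  # previously computed partial results

    def add_batch(self, batch):
        """Fold one new batch of (key, value) pairs into the stored totals."""
        for key, value in batch:
            self.totals[key] += value
        return dict(self.totals)


def full_recompute(batches):
    """Baseline: evaluate the same query from scratch over the complete dataset."""
    totals = defaultdict(float)
    for batch in batches:
        for key, value in batch:
            totals[key] += value
    return dict(totals)


# Hypothetical stream of two batches.
batches = [[("a", 1), ("b", 2)], [("a", 3), ("b", 3)]]
view = IncrementalGroupedSum()
for batch in batches:
    latest = view.add_batch(batch)
```

Each call to `add_batch` touches only the new tuples, so its cost depends on the batch size rather than on the size of the accumulated dataset. This shortcut works here because summation is distributive over batches. Other aggregates may need extra stored state.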


2021 ◽  
Vol 179 (2) ◽  
pp. 113-134
Author(s):  
Samira Akili ◽  
Matthias Weidlich

Complex event processing (CEP) evaluates queries over streams of event data to detect situations of interest. If the event data are produced by geographically distributed sources, CEP may exploit in-network processing that distributes the evaluation of a query among the nodes of a network. To this end, a query is modularized and individual query operators are assigned to nodes, especially those that act as data sources. Existing solutions for such operator placement, however, are limited in that they assume all query results to be gathered at one designated node, commonly referred to as a sink. Hence, existing techniques postulate a hierarchical structure of the network that generates and processes the event data. This largely neglects the optimisation potential that stems from truly decentralised query evaluation with potentially many sinks. To address this gap, in this paper, we propose Multi-Sink Evaluation (MuSE) graphs as a formal computational model to evaluate common CEP queries in a decentralised manner. We further prove the completeness of query evaluation under this model. Striving for distributed CEP that can scale to large volumes of high-frequency event streams, we show how to reason about the network costs induced by distributed query evaluation and prune inefficient query execution plans. As such, our work lays the foundation for distributed CEP that is both sound and efficient.


2021 ◽  
pp. 374-391
Author(s):  
Gianluca Cima ◽  
Domenico Lembo ◽  
Lorenzo Marconi ◽  
Riccardo Rosati ◽  
Domenico Fabio Savo

Author(s):  
Sabrina De Capitani di Vimercati ◽  
Sara Foresti ◽  
Sushil Jajodia ◽  
Giovanni Livraga ◽  
Stefano Paraboschi ◽  
...  
