From Relation Algebra to Semi-join Algebra: An Approach to Graph Query Optimization

Abstract: Many graph query languages rely on composition to navigate graphs and select nodes of interest, even though evaluating compositions of relations can be costly. Often, this need for composition can be reduced by rewriting toward queries that use semi-joins instead, resulting in a significant reduction of the query evaluation cost. We study techniques to recognize and apply such rewritings. Concretely, we study the relationship between the expressive power of the relation algebras, which heavily rely on composition, and the semi-join algebras, which replace composition with semi-joins. Our main result is that each fragment of the relation algebras where intersection and/or difference is only used on edges (and not on complex compositions) is expressively equivalent to a fragment of the semi-join algebras. This expressive equivalence holds for node queries evaluating to sets of nodes. For practical relevance, we exhibit constructive rules for rewriting relation algebra queries to semi-join algebra queries and prove that they lead to only a well-bounded increase in the number of steps needed to evaluate the rewritten queries. In addition, on sibling-ordered trees, we establish new relationships among the expressive power of Regular XPath, Conditional XPath, FO-logic and the semi-join algebra augmented with restricted fixpoint operators.
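To make the contrast between composition and semi-joins concrete, the following is a minimal sketch, not the paper's formal operators. It assumes binary relations stored as Python sets of pairs; the relation names `knows` and `works_at` and the helper functions are hypothetical. For a node query that only asks which nodes have a two-step path, both routes return the same nodes, but the semi-join never builds the composed pairs.

```python
def compose(r, s):
    """Composition: pairs (a, c) such that (a, b) is in r and (b, c) is in s."""
    by_source = {}
    for b, c in s:
        by_source.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in by_source.get(b, ())}


def semijoin(r, s):
    """Semi-join: keep the pairs (a, b) of r whose target b has an outgoing s-edge."""
    sources = {b for b, _ in s}
    return {(a, b) for a, b in r if b in sources}


def first_column(r):
    """Node query result: the set of source nodes of a relation."""
    return {a for a, _ in r}


# Hypothetical edge relations of a small graph.
knows = {(1, 2), (1, 3), (4, 5)}
works_at = {(2, "x"), (2, "y"), (3, "x")}

# Which nodes know someone who works somewhere?
via_composition = first_column(compose(knows, works_at))
via_semijoin = first_column(semijoin(knows, works_at))
```

The semi-join result is always a subset of `knows`, so its size is bounded by the size of one input, whereas a composition can contain up to one pair for every combination of a source and a final target.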


Query Rewriting for Incremental Continuous Query Evaluation in HIFUN

Algorithms ◽

10.3390/a14050149 ◽

2021 ◽

Vol 14 (5) ◽

pp. 149

Author(s):

Petros Zervoudakis ◽

Haridimos Kondylakis ◽

Nicolas Spyratos ◽

Dimitris Plexousakis

Keyword(s):

Query Optimization ◽

Query Language ◽

Computational Cost ◽

Continuous Queries ◽

Continuous Query ◽

Query Rewriting ◽

Query Evaluation ◽

Clear Separation ◽

Complete Dataset ◽

High Level

HIFUN is a high-level query language for expressing analytic queries over big datasets, offering a clear separation between the conceptual layer, where analytic queries are defined independently of the nature and location of data, and the physical layer, where queries are evaluated. In this paper, we present a methodology based on the HIFUN language, and the corresponding algorithms for the incremental evaluation of continuous queries. In essence, our approach is able to process the most recent data batch by exploiting already computed information, without requiring the evaluation of the query over the complete dataset. We present the generic algorithm, which we translated to both SQL and MapReduce using SPARK; it implements various query rewriting methods. We demonstrate the effectiveness of our approach in terms of query answering efficiency. Finally, we show that by exploiting the formal query rewriting methods of HIFUN, we can further reduce the computational cost, adding another layer of query optimization to our implementation.
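The idea of answering a continuous query from the newest batch plus previously computed results can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from the paper: it assumes a single grouped sum (total quantity per branch), and the names `evaluate`, `merge`, `batch_1` and `batch_2` are hypothetical. It also relies on the aggregate being a sum, which can be combined from partial results.

```python
def evaluate(batch):
    """Evaluate the query over one batch: total quantity per branch."""
    totals = {}
    for branch, quantity in batch:
        totals[branch] = totals.get(branch, 0) + quantity
    return totals


def merge(previous, delta):
    """Combine the stored answer with the answer computed on the new batch only."""
    merged = dict(previous)
    for branch, subtotal in delta.items():
        merged[branch] = merged.get(branch, 0) + subtotal
    return merged


# Hypothetical data batches arriving over time.
batch_1 = [("athens", 10), ("paris", 5)]
batch_2 = [("athens", 2), ("rome", 7)]

answer = evaluate(batch_1)                 # initial evaluation
answer = merge(answer, evaluate(batch_2))  # incremental step: only batch_2 is scanned
```

The incremental answer matches a full re-evaluation over `batch_1 + batch_2`, while the second step touches only the new batch.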


Web Retrieval of XML Documents

Web-Enabled Systems Integration ◽

10.4018/978-1-59140-041-7.ch009 ◽

2011 ◽

pp. 170-199

Author(s):

Barbara Catania ◽

Elena Ferrari

Keyword(s):

Expressive Power ◽

Data Representation ◽

Query Languages ◽

Heterogeneous Data ◽

Data Sources ◽

Xml Data ◽

Web Documents ◽

Web Retrieval ◽

Heterogeneous Data Sources ◽

The Web

The Web is characterized by a huge number of very heterogeneous data sources that differ both in media support and in format representation. In this scenario, there is a need for an integrated approach to querying heterogeneous Web documents. To this end, XML can play an important role, since it is becoming a standard for data representation and exchange over the Web. Due to its flexibility, XML is currently being used as an interface language over the Web, by which (part of) document sources are represented and exported. Under this assumption, the problem of querying heterogeneous sources can be reduced to the problem of querying XML data sources. In this chapter, we first survey the most relevant query languages for XML data proposed both by the scientific community and by standardization committees, e.g., W3C, mainly focusing on their expressive power. Then, we investigate how typical Information Retrieval concepts, such as ranking, similarity-based search, and profile-based search, can be applied to XML query languages. Commercial products based on the considered approaches are then briefly surveyed. Finally, we conclude the chapter by providing an overview of the most promising research trends in the field.
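As a small illustration of what querying an XML data source looks like in practice, the sketch below uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree` module. The document content and element names are invented for the example and are not taken from the chapter, which surveys considerably richer query languages.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML source exported by some document collection.
doc = ET.fromstring("""
<library>
  <book year="2001"><title>XML Basics</title></book>
  <book year="1995"><title>Databases</title></book>
</library>
""")

# Path expression with an attribute predicate: titles of books from 2001.
titles = [t.text for t in doc.findall("./book[@year='2001']/title")]
```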


On the Kolmogorov expressive power of boolean query languages

Database Theory — ICDT '95 - Lecture Notes in Computer Science ◽

10.1007/3-540-58907-4_9 ◽

1995 ◽

pp. 97-110 ◽

Cited By ~ 1

Author(s):

Jerzy Tyszkiewicz

Keyword(s):

Expressive Power ◽

Query Languages ◽

Boolean Query


On graph query optimization in large networks

Proceedings of the VLDB Endowment ◽

10.14778/1920841.1920887 ◽

2010 ◽

Vol 3 (1-2) ◽

pp. 340-351 ◽

Cited By ~ 119

Author(s):

Peixiang Zhao ◽

Jiawei Han

Keyword(s):

Query Optimization ◽

Large Networks ◽

Graph Query


THE VARIETY OF COSET RELATION ALGEBRAS

Journal of Symbolic Logic ◽

10.1017/jsl.2018.48 ◽

2018 ◽

Vol 83 (04) ◽

pp. 1595-1609 ◽

Cited By ~ 2

Author(s):

STEVEN GIVANT ◽

HAJNAL ANDRÉKA

Keyword(s):

Large Class ◽

Identity Element ◽

Relation Algebra ◽

Order Theory ◽

Relation Algebras ◽

First Order ◽

First Order Theory ◽

Atomic Pair ◽

Finite Set ◽

Image Position

Abstract: Givant [6] generalized the notion of an atomic pair-dense relation algebra from Maddux [13] by defining the notion of a measurable relation algebra, that is to say, a relation algebra in which the identity element is a sum of atoms that can be measured in the sense that the “size” of each such atom can be defined in an intuitive and reasonable way (within the framework of the first-order theory of relation algebras). In Andréka–Givant [2], a large class of examples of such algebras is constructed from systems of groups, coordinated systems of isomorphisms between quotients of the groups, and systems of cosets that are used to “shift” the operation of relative multiplication. In Givant–Andréka [8], it is shown that the class of these full coset relation algebras is adequate to the task of describing all measurable relation algebras in the sense that every atomic and complete measurable relation algebra is isomorphic to a full coset relation algebra. Call an algebra $\mathfrak{A}$ a coset relation algebra if $\mathfrak{A}$ is embeddable into some full coset relation algebra. In the present article, it is shown that the class of coset relation algebras is equationally axiomatizable (that is to say, it is a variety), but that no finite set of sentences suffices to axiomatize the class (that is to say, the class is not finitely axiomatizable).


Graph query algebra and visual proximity rules for biological pathway exploration

Information Visualization ◽

10.1177/1473871616666394 ◽

2016 ◽

Vol 16 (3) ◽

pp. 217-231

Author(s):

Keqin Wu ◽

Liang Sun ◽

Carl Schmidt ◽

Jian Chen

Keyword(s):

Biological Pathway ◽

Exploration Process ◽

Graph Query ◽

The Relationship ◽

Pathway Graph ◽

Selection Of

We present the design and validation of an example-based pathway graph query algebra and visual proximity rules to address challenging large biological pathway exploration tasks. Our pathway graph query algebra interprets relationship queries given by selected examples to find a match for, extract identical parts between, or trace a path from any pathway components. To support the relationship query, users can composite pathway visualizations through visual proximity rules that use proximity to infer users’ intentions in the exploration process. By allowing selection of one or more objects from multiple on-screen grouped graphs as query inputs and using the query outputs as next-query inputs, pathway graph query algebra and visual proximity rules achieve intuitiveness, concurrence, and dynamics for pathway exploration.


Non-finite-axiomatizability results in algebraic logic

Journal of Symbolic Logic ◽

10.2307/2275434 ◽

1992 ◽

Vol 57 (3) ◽

pp. 832-843 ◽

Cited By ~ 10

Author(s):

Balázs Biró

Keyword(s):

Algebraic Logic ◽

Relation Algebra ◽

Equational Theory ◽

Negative Answer ◽

Relation Algebras ◽

Representable Relation Algebra ◽

Finite Set ◽

Variable Symbol ◽

Axiom Systems ◽

Representable Relation

This paper deals with relation, cylindric and polyadic equality algebras. First of all it addresses a problem of B. Jónsson. He asked whether relation set algebras can be expanded by finitely many new operations in a “reasonable” way so that the class of these expansions would possess a finite equational base. The present paper gives a negative answer to this problem: Our main theorem states that whenever Rs+ is a class that consists of expansions of relation set algebras such that each operation of Rs+ is logical in Jónsson's sense, i.e., is the algebraic counterpart of some (derived) connective of first-order logic, then the equational theory of Rs+ has no finite axiom system. Similar results are stated for the other classes mentioned above. As a corollary to this theorem we can solve a problem of Tarski and Givant [87]. Namely, we claim that the valid formulas of certain languages cannot be axiomatized by a finite set of logical axiom schemes. At the same time we give a negative solution for a version of a problem of Henkin and Monk [74] (cf. also Monk [70] and Németi [89]). Throughout we use the terminology, notation and results of Henkin, Monk, Tarski [71] and [85]. We also use results of Maddux [89a]. Notation. RA denotes the class of relation algebras, Rs denotes the class of relation set algebras and RRA is the class of representable relation algebras, i.e. the class of subdirect products of relation set algebras. The symbols RA, Rs and RRA also abbreviate the expressions relation algebra, relation set algebra and representable relation algebra, respectively. For any class C of similar algebras EqC is the set of identities that hold in C, while Eq1C is the set of those identities in EqC that contain at most one variable symbol. (We note that Henkin et al. [85] uses the symbol EqC in another sense.)


Pseudo-finite homogeneity and saturation

Journal of Symbolic Logic ◽

10.2307/2586806 ◽

1999 ◽

Vol 64 (4) ◽

pp. 1689-1699 ◽

Cited By ~ 5

Author(s):

Jörg Flum ◽

Martin Ziegler

Keyword(s):

Expressive Power ◽

Query Languages ◽

Stable Theory ◽

Finite Cover ◽

Database Query ◽

Homogeneity Property ◽

Finite States

Abstract: When analyzing database query languages, a property of theories, the pseudo-finite homogeneity property, has been introduced and applied (cf. [3]). We show that a stable theory has the pseudo-finite homogeneity property just in case its expressive power for finite states is bounded. Moreover, we introduce the corresponding pseudo-finite saturation property and show that a theory fails to have the finite cover property if and only if it has the pseudo-finite saturation property.
