scholarly journals Best Answers over Incomplete Data : Complexity and First-Order Rewritings

Author(s):  
Amélie Gheerbrant ◽  
Cristina Sirangelo

Answering queries over incomplete data is ubiquitous in data management and in many AI applications that use query rewriting to take advantage of relational database technology. In these scenarios one lacks full information on the data but queries still need to be answered with certainty. The certainty aspect often makes query answering unfeasible except for restricted classes, such as unions of conjunctive queries. In addition often there are no, or very few certain answers, thus expensive computation is in vain. Therefore we study a relaxation of certain answers called best answers. They are defined as those answers for which there is no better one (that is, no answer true in more possible worlds). When certain answers exist the two notions coincide. We compare different ways of casting query answering as a decision problem and characterise its complexity for first-order queries, showing significant differences in the behavior of best and certain answers.We then restrict attention to best answers for unions of conjunctive queries and produce a practical algorithm for finding them based on query rewriting techniques.

Author(s):  
Marco Console ◽  
Paolo Guagliardo ◽  
Leonid Libkin

Querying incomplete data is an important task both in data management, and in many AI applications that use query rewriting to take advantage of relational database technology. Usually one looks for answers that are certain, i.e., true in every possible world represented by an incomplete database. For positive queries, expressed either in positive relational algebra or as unions of conjunctive queries, finding such answers can be done efficiently when databases and query answers are sets. Real-life databases however use bag, rather than set, semantics. For bags, instead of saying that a tuple is certainly in the answer, we have more detailed information: namely, the range of the numbers of occurrences of the tuple in query answers. We show that the behavior of positive queries is different under bag semantics: finding the minimum number of occurrences can still be done efficiently, but for maximum it becomes intractable. We use these results to investigate approximation schemes for computing certain answers to arbitrary first-order queries that have been proposed for set semantics. One of them cannot be adapted to bags, as it relies on the intractable maxima of occurrences, but another scheme only deals with minima, and we show how to adapt it to bag semantics without losing efficiency.


Semantic Web ◽  
2020 ◽  
pp. 1-25
Author(s):  
Enrique Matos Alfonso ◽  
Alexandros Chortaras ◽  
Giorgos Stamou

In this paper, we study the problem of query rewriting for disjunctive existential rules. Query rewriting is a well-known approach for query answering on knowledge bases with incomplete data. We propose a rewriting technique that uses negative constraints and conjunctive queries to remove the disjunctive components of disjunctive existential rules. This process eventually generates new non-disjunctive rules, i.e., existential rules. The generated rules can then be used to produce new rewritings using existing rewriting approaches for existential rules. With the proposed technique we are able to provide complete UCQ-rewritings for union of conjunctive queries with universally quantified negation. We implemented the proposed algorithm in the Completo system and performed experiments that evaluate the viability of the proposed solution.


Author(s):  
Tanya Braun ◽  
Ralf Möller

A standard approach for inference in probabilistic formalisms with first-order constructs is lifted variable elimination (LVE) for single queries. To handle multiple queries efficiently, the lifted junction tree algorithm (LJT) employs a first-order cluster representation of a model and LVE as a subroutine. Both algorithms answer conjunctive queries of propositional random variables, shattering the model on the query, which causes unnecessary groundings for conjunctive queries of interchangeable variables. This paper presents parameterised queries as a means to avoid groundings, applying the lifting idea to queries. Parameterised queries enable LVE and LJT to compute answers faster, while compactly representing queries and answers.


Author(s):  
Franz Baader ◽  
Stefan Borgwardt ◽  
Marcel Lippmann

We investigate ontology-based query answering (OBQA) in a setting where both the ontology and the query can refer to concrete values such as numbers and strings. In contrast to previous work on this topic, the built-in predicates used to compare values are not restricted to being unary. We introduce restrictions on these predicates and on the ontology language that allow us to reduce OBQA to query answering in databases using the so-called combined rewriting approach. Though at first sight our restrictions are different from the ones used in previous work, we show that our results strictly subsume some of the existing first-order rewritability results for unary predicates.


2020 ◽  
Vol 34 (03) ◽  
pp. 2734-2741 ◽  
Author(s):  
Medina Andresel ◽  
Magdalena Ortiz ◽  
Mantas Simkus

Among many solutions for extracting useful answers from incomplete data, ontology-mediated queries (OMQs) use domain knowledge to infer missing facts. We propose an extension of OMQs that allows us to make certain assumptions—for example, about parts of the data that may be unavailable at query time, or costly to query—and retrieve conditional answers, that is, tuples that become certain query answers when the assumptions hold. We show that querying in this powerful formalism often has no higher worst-case complexity than in plain OMQs, and that these queries are first-order rewritable for DL-Liteℛ. Rewritability is preserved even if we allow some use of closed predicates to combine the (partial) closed- and open-world assumptions. This is remarkable, as closed predicates are a very useful extension of OMQs, but they usually make query answering intractable in data complexity, even in very restricted settings.


2020 ◽  
Vol 34 (03) ◽  
pp. 3049-3056
Author(s):  
Heng Zhang ◽  
Yan Zhang ◽  
Jia-Huai You ◽  
Zhiyong Feng ◽  
Guifei Jiang

An ontology language for ontology mediated query answering (OMQA-language) is universal for a family of OMQA-languages if it is the most expressive one among this family. In this paper, we focus on three families of tractable OMQA-languages, including first-order rewritable languages and languages whose data complexity of the query answering is in AC0 or PTIME. On the negative side, we prove that there is, in general, no universal language for each of these families of languages. On the positive side, we propose a novel property, the locality, to approximate the first-order rewritability, and show that there exists a language of disjunctive embedded dependencies that is universal for the family of OMQA-languages with locality. All of these results apply to OMQA with query languages such as conjunctive queries, unions of conjunctive queries and acyclic conjunctive queries.


Author(s):  
Zhe Wang ◽  
Peng Xiao ◽  
Kewen Wang ◽  
Zhiqiang Zhuang ◽  
Hai Wan

Existential rules are an expressive ontology formalism for ontology-mediated query answering and thus query answering is of high complexity, while several tractable fragments have been identified. Existing systems based on first-order rewriting methods can lead to queries too large for DBMS to handle. It is shown that datalog rewriting can result in more compact queries, yet previously proposed datalog rewriting methods are mostly inefficient for implementation. In this paper, we fill the gap by proposing an efficient datalog rewriting approach for answering conjunctive queries over existential rules, and identify and combine existing fragments of existential rules for which our rewriting method terminates. We implemented a prototype system Drewer, and experiments show that it is able to handle a wide range of benchmarks in the literature. Moreover, Drewer shows superior or comparable performance over state-of-the-art systems on both the compactness of rewriting and the efficiency of query answering.


Author(s):  
Diego Calvanese ◽  
Julien Corman ◽  
Davide Lanti ◽  
Simon Razniewski

Counting answers to a query is an operation supported by virtually all database management systems. In this paper we focus on counting answers over a Knowledge Base (KB), which may be viewed as a database enriched with background knowledge about the domain under consideration. In particular, we place our work in the context of Ontology-Mediated Query Answering/Ontology-based Data Access (OMQA/OBDA), where the language used for the ontology is a member of the DL-Lite family and the data is a (usually virtual) set of assertions. We study the data complexity of query answering, for different members of the DL-Lite family that include number restrictions, and for variants of conjunctive queries with counting that differ with respect to their shape (connected, branching, rooted). We improve upon existing results by providing PTIME and coNP lower bounds, and upper bounds in PTIME and LOGSPACE. For the LOGSPACE case, we have devised a novel query rewriting technique into first-order logic with counting.


Author(s):  
Giovanni Amendola ◽  
Leonid Libkin

When a dataset is not fully specified and can represent many possible worlds, one commonly answers queries by computing certain answers to them. A natural way of defining certainty is to say that an answer is certain if it is consistent with query answers in all possible worlds, and is furthermore the most informative answer with this property. However, the existence and complexity of such answers is not yet well understood even for relational databases. Thus in applications one tends to use different notions, essentially the intersection of query answers in possible worlds. However, justification of such notions has long been questioned. This leads to two problems: are certain answers based on informativeness feasible in applications? and can a clean justification be provided for intersection-based notions? Our goal is to answer both. For the former, we show that such answers may not exist, or be very large, even in simple cases of querying incomplete data. For the latter, we add the concept of explanations to the notion of informativeness: it shows not only that one object is more informative than the other, but also says why this is so. This leads to a modified notion of certainty: explainable certain answers. We present a general framework for reasoning about them, and show that for open and closed world relational databases, they are precisely the common intersection-based notions of certainty.


Sign in / Sign up

Export Citation Format

Share Document