Managing Inconsistent Databases Using Active Integrity Constraints

Author(s):  
Sergio Flesca ◽  
Sergio Greco ◽  
Ester Zumpano

Integrity constraints are a fundamental part of a database schema. They are generally used to define constraints on data (functional dependencies, inclusion dependencies, exclusion dependencies, etc.), and their enforcement ensures a semantically correct state of a database. As the presence of data inconsistent with respect to integrity constraints is not unusual, its management plays a key role in all the areas in which duplicate or conflicting information is likely to occur, such as database integration, data warehousing, and federated databases (Bry, 1997; Lin, 1996; Subrahmanian, 1994). It is well known that inconsistent data can be managed by "repairing" the database, that is, by producing consistent databases through a minimal set of update operations on the original inconsistent database, or by consistently answering queries posed over the inconsistent database.
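To make the repair notion concrete, the following Python sketch enumerates the minimal tuple-deletion repairs of a relation under a key constraint. The relation, attribute names, and constraint are invented for illustration and are not taken from the article.

```python
from itertools import product

def repairs_under_key(rows, key):
    """Enumerate the minimal tuple-deletion repairs of `rows` under a
    key constraint: no two rows may agree on the `key` attributes.
    Each minimal repair keeps exactly one row from every group of
    rows that share the same key value."""
    groups = {}
    for row in rows:
        groups.setdefault(tuple(row[a] for a in key), []).append(row)
    # One surviving row is chosen per key value; the cross product of
    # these choices yields all minimal repairs.
    return [list(choice) for choice in product(*groups.values())]

# Example: two sources disagree on the salary of employee "ann",
# violating the key constraint on "name".
emp = [
    {"name": "ann", "salary": 50},
    {"name": "ann", "salary": 60},
    {"name": "bob", "salary": 40},
]
for repair in repairs_under_key(emp, key=("name",)):
    print(sorted((row["name"], row["salary"]) for row in repair))
```

This yields two repairs, one per choice of ann's salary; the undisputed tuple for "bob" survives in both, which is exactly why it would belong to every consistent answer.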

2020 ◽  
Vol 13 (10) ◽  
pp. 1682-1695
Author(s):  
Ester Livshits ◽  
Alireza Heidari ◽  
Ihab F. Ilyas ◽  
Benny Kimelfeld

The problem of mining integrity constraints from data has been extensively studied over the past two decades for commonly used types of constraints, including the classic Functional Dependencies (FDs) and the more general Denial Constraints (DCs). In this paper, we investigate the problem of mining approximate DCs from data, that is, DCs that are "almost" satisfied. Approximation allows us to discover more accurate constraints in inconsistent databases and to detect rules that are generally correct but may have a few exceptions. It also allows us to avoid overfitting and to obtain constraints that are more general, more natural, and less contrived. We introduce the algorithm ADCMiner for mining approximate DCs. An important feature of this algorithm is that it does not assume any specific approximation function for DCs, but rather allows for arbitrary approximation functions that satisfy some natural axioms that we define in the paper. We also show how our algorithm can be combined with sampling to return highly accurate results considerably faster.
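One common choice of approximation function, sketched below in Python, measures the fraction of tuple pairs that violate a constraint; the abstract's point is that ADCMiner works for any function satisfying its axioms, not just this one. The FD, relation, and threshold here are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

def violation_ratio(rows, lhs, rhs):
    """Fraction of tuple pairs violating the FD lhs -> rhs: pairs that
    agree on all lhs attributes but differ on some rhs attribute."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        return 0.0
    bad = sum(1 for s, t in pairs
              if all(s[a] == t[a] for a in lhs)
              and any(s[b] != t[b] for b in rhs))
    return bad / len(pairs)

def approximately_holds(rows, lhs, rhs, eps):
    """The FD 'almost' holds if at most a fraction eps of pairs violate it."""
    return violation_ratio(rows, lhs, rhs) <= eps

# "dept -> mgr" holds except for one exceptional tuple:
emp = [
    {"dept": "it", "mgr": "ann"},
    {"dept": "it", "mgr": "ann"},
    {"dept": "it", "mgr": "eve"},  # the exception
    {"dept": "hr", "mgr": "bob"},
]
print(violation_ratio(emp, lhs=("dept",), rhs=("mgr",)))
```

With a tolerance of, say, eps = 0.4 the dependency is accepted as approximate despite the exception, whereas exact FD mining would reject it outright.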


2009 ◽  
pp. 2051-2058
Author(s):  
Luciano Caroprese ◽  
Cristian Molinaro ◽  
Irina Trubitsyna ◽  
Ester Zumpano

Integrating data from different sources consists of two main steps: first, the various relations are merged together; second, some tuples are removed from (or inserted into) the resulting database in order to satisfy integrity constraints. There are several ways to integrate databases or possibly distributed information sources, but whatever integration architecture we choose, the heterogeneity of the sources to be integrated causes subtle problems. In particular, the database obtained from the integration process may be inconsistent with respect to integrity constraints, that is, one or more integrity constraints are not satisfied. Integrity constraints represent an important source of information about the real world. They are usually used to define constraints on data (functional dependencies, inclusion dependencies, etc.) and nowadays have a wide applicability in several contexts such as semantic query optimization, cooperative query answering, database integration, and view update.


Author(s):  
Sergio Greco ◽  
Cristina Sirangelo ◽  
Irina Trubitsyna ◽  
Ester Zumpano

The objective of this article is to investigate the problems related to the extensional integration of information sources. In particular, we propose an approach for managing inconsistent databases, that is, databases violating integrity constraints. The problem of dealing with inconsistent information has recently assumed additional relevance as it plays a key role in all the areas in which duplicate information or conflicting information is likely to occur (Agarwal et al., 1995; Arenas, Bertossi & Chomicki, 1999; Bry, 1997; Dung, 1996; Lin & Mendelzon, 1999; Subrahmanian, 1994).


Author(s):  
Luciano Caroprese ◽  
Ester Zumpano

Data integration aims to provide uniform integrated access to multiple heterogeneous information sources designed independently and having closely related contents. However, the integrated view, constructed by combining the information provided by the different data sources by means of a specified integration strategy, could contain inconsistent data; that is, it can violate some of the constraints defined on the data. In the presence of an inconsistent integrated database, that is, a database that does not satisfy some integrity constraints, two possible solutions have been investigated in the literature (Agarwal, Keller, Wiederhold, & Saraswat, 1995; Bry, 1997; Calì, Calvanese, De Giacomo, & Lenzerini, 2002; Dung, 1996; Grant & Subrahmanian, 1995; S. Greco & Zumpano, 2000; Lin & Mendelzon, 1999): repairing the database or computing consistent answers over the inconsistent database. Intuitively, a repair of the database consists of deleting or inserting a minimal number of tuples so that the resulting database is consistent, whereas the computation of the consistent answer consists of selecting the set of certain tuples (i.e., those belonging to all repaired databases) and the set of uncertain tuples (i.e., those belonging to a proper subset of repaired databases).
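The certain/uncertain split described above can be sketched directly: given the repairs of an inconsistent database, the certain tuples are those in every repair and the uncertain ones are those in some but not all repairs. The repairs and tuples below are an invented example, assuming two sources that disagree on one employee's salary.

```python
def consistent_answer(repairs):
    """Split the tuples occurring in a set of repairs into certain
    tuples (present in every repair) and uncertain tuples (present
    in some, but not all, repairs)."""
    sets = [set(r) for r in repairs]
    certain = set.intersection(*sets)
    uncertain = set.union(*sets) - certain
    return certain, uncertain

# Two repairs of a database where the sources disagree on ann's salary:
r1 = [("ann", 50), ("bob", 40)]
r2 = [("ann", 60), ("bob", 40)]
certain, uncertain = consistent_answer([r1, r2])
print(certain)    # {('bob', 40)}
print(uncertain)  # {('ann', 50), ('ann', 60)}
```

A query over this database would return ("bob", 40) as a certain answer, while either salary for "ann" is only a possible answer.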


2002 ◽  
pp. 172-202
Author(s):  
Sergio Greco ◽  
Ester Zumpano

Integrity constraints represent an important source of information about the real world. They are usually used to define constraints on data (functional dependencies, inclusion dependencies, etc.). Nowadays integrity constraints have a wide applicability in several contexts such as semantic query optimization, cooperative query answering, database integration and view update. Databases are often inconsistent with respect to integrity constraints, that is, one or more integrity constraints are not satisfied. This may happen, for instance, when the database is obtained from the integration of different information sources. The integration of knowledge from multiple sources is an important aspect in several areas such as data warehousing, database integration, automated reasoning systems and active and reactive databases.


2001 ◽  
Vol 4 (1-2) ◽  
pp. 275-283
Author(s):  
Thomas Hanke

syncWRITER is an interlinear text editor with a focus on the presentation of transcribed data. As it seamlessly integrates digital video, it is a useful tool for sign language transcription. This article discusses syncWRITER’s limitations in the areas that turn out to be essential in large-scale transcription projects. These are synchronization, multi-user database integration, data retrieval, and coreference handling.


2014 ◽  
Vol 70 (5) ◽  
Author(s):  
Thabit Sabbah ◽  
Ali Selamat

Thesauri are used in many Information Retrieval (IR) applications such as data integration, data warehousing, semantic query processing and classifiers. They have also been utilized to solve the problem of schema matching. Given that many thesauri exist for a certain area of knowledge, the quality of schema matching results obtained when using different thesauri in the same field is not predictable. In this paper, we propose a methodology to study the performance of a thesaurus in solving schema matching. The paper also presents results of experiments using different thesauri. Precision, recall, F-measure, and similarity average were calculated to show that the quality of matching changed according to the thesaurus used.
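The quality measures named in the abstract can be computed as follows for a set of discovered attribute correspondences against a reference alignment. The schemas and correspondences are hypothetical; the abstract does not specify its data sets.

```python
def matching_quality(found, reference):
    """Precision, recall and F-measure of discovered schema-matching
    correspondences against a reference (gold-standard) alignment."""
    found, reference = set(found), set(reference)
    tp = len(found & reference)  # correspondences found and correct
    precision = tp / len(found) if found else 0.0
    recall = tp / len(reference) if reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Hypothetical correspondences between two purchase-order schemas:
found = {("PO.amount", "Order.total"), ("PO.date", "Order.created")}
ref = {("PO.amount", "Order.total"), ("PO.buyer", "Order.customer")}
print(matching_quality(found, ref))  # (0.5, 0.5, 0.5)
```

Rerunning a matcher with a different thesaurus changes the `found` set, and hence these scores, which is exactly the variability the paper studies.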


Author(s):  
Charalampos Nikolaou ◽  
Bernardo Cuenca Grau ◽  
Egor V. Kostylev ◽  
Mark Kaminski ◽  
Ian Horrocks

We extend ontology-based data access with integrity constraints over both the source and target schemas. The relevant reasoning problems in this setting are constraint satisfaction—to check whether a database satisfies the target constraints given the mappings and the ontology—and source-to-target (resp., target-to-source) constraint implication, which is to check whether a target constraint (resp., a source constraint) is satisfied by each database satisfying the source constraints (resp., the target constraints). We establish decidability and complexity bounds for all these problems in the case where ontologies are expressed in DL-LiteR and constraints range from functional dependencies to disjunctive tuple-generating dependencies.

