Inconsistent Databases


2020 ◽  
Vol 13 (10) ◽  
pp. 1682-1695
Author(s):  
Ester Livshits ◽  
Alireza Heidari ◽  
Ihab F. Ilyas ◽  
Benny Kimelfeld

The problem of mining integrity constraints from data has been extensively studied over the past two decades for commonly used types of constraints, including the classic Functional Dependencies (FDs) and the more general Denial Constraints (DCs). In this paper, we investigate the problem of mining approximate DCs from data, that is, DCs that are "almost" satisfied. Approximation allows us to discover more accurate constraints in inconsistent databases and to detect rules that are generally correct but may have a few exceptions. It also allows us to avoid overfitting and obtain constraints that are more general, more natural, and less contrived. We introduce the algorithm ADCMiner for mining approximate DCs. An important feature of this algorithm is that it does not assume any specific approximation function for DCs, but rather allows for arbitrary approximation functions that satisfy some natural axioms that we define in the paper. We also show how our algorithm can be combined with sampling to return highly accurate results considerably faster.
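To make the notion of an "almost" satisfied DC concrete, the following minimal sketch measures one possible approximation function: the fraction of ordered tuple pairs that violate a constraint. This is an illustrative assumption, not ADCMiner itself and not necessarily one of the functions used in the paper; the names `violation_fraction`, `zip_city_conflict`, the sample rows, and the threshold `epsilon` are all hypothetical.

```python
from itertools import permutations


def violation_fraction(rows, predicate):
    """Return the fraction of ordered pairs of distinct rows for which
    `predicate` holds, i.e. the pairs that violate the denial constraint."""
    pairs = list(permutations(rows, 2))
    if not pairs:
        return 0.0
    violations = sum(1 for t, u in pairs if predicate(t, u))
    return violations / len(pairs)


def zip_city_conflict(t, u):
    """Hypothetical DC: no two rows may share a zip code but differ in city.
    The predicate is true exactly when the pair (t, u) violates that rule."""
    return t["zip"] == u["zip"] and t["city"] != u["city"]


# Hypothetical sample data with one exceptional row ("Newark").
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "60601", "city": "Chicago"},
]

epsilon = 0.4  # assumed tolerance; choosing it is application-specific
fraction = violation_fraction(rows, zip_city_conflict)
is_approximately_satisfied = fraction <= epsilon
```

Here 4 of the 12 ordered pairs conflict, so the constraint fails as an exact DC but is accepted as an approximate one under the assumed tolerance. Enumerating all pairs is quadratic in the number of rows, which suggests why combining mining with sampling can be attractive on large tables.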





Author(s):  
Sergio Greco ◽  
Cristina Sirangelo ◽  
Irina Trubitsyna ◽  
Ester Zumpano

The objective of this article is to investigate the problems related to the extensional integration of information sources. In particular, we propose an approach for managing inconsistent databases, that is, databases violating integrity constraints. The problem of dealing with inconsistent information has recently assumed additional relevance as it plays a key role in all the areas in which duplicate information or conflicting information is likely to occur (Agarwal et al., 1995; Arenas, Bertossi & Chomicki, 1999; Bry, 1997; Dung, 1996; Lin & Mendelzon, 1999; Subrahmanian, 1994).



Author(s):  
Sergio Flesca ◽  
Sergio Greco ◽  
Ester Zumpano

Integrity constraints are a fundamental part of a database schema. They are generally used to define constraints on data (functional dependencies, inclusion dependencies, exclusion dependencies, etc.), and their enforcement ensures a semantically correct state of a database. As the presence of data inconsistent with respect to integrity constraints is not unusual, its management plays a key role in all the areas in which duplicate or conflicting information is likely to occur, such as database integration, data warehousing, and federated databases (Bry, 1997; Lin, 1996; Subrahmanian, 1994). It is well known that the presence of inconsistent data can be managed by “repairing” the database, that is, by providing consistent databases, obtained by a minimal set of update operations on the inconsistent original environment, or by consistently answering queries posed over the inconsistent database.
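The two strategies mentioned above, repairing and consistent query answering, can be illustrated with a small sketch. It assumes a single functional dependency `emp -> dept` over hypothetical `(emp, dept)` rows, restricts updates to deletions (so repairs are maximal consistent subsets), and uses brute-force enumeration that is only feasible for tiny inputs. It is not the method of the article, and all function names are illustrative.

```python
from itertools import combinations


def is_consistent(rows):
    """True if no two rows agree on emp but differ on dept (FD emp -> dept)."""
    seen = {}
    for emp, dept in rows:
        if seen.setdefault(emp, dept) != dept:
            return False
    return True


def subset_repairs(rows):
    """Return all maximal consistent subsets of `rows`.
    Brute force over every subset, so exponential in len(rows)."""
    rows = list(rows)
    consistent = [
        frozenset(subset)
        for size in range(len(rows) + 1)
        for subset in combinations(rows, size)
        if is_consistent(subset)
    ]
    # Keep only subsets that are not strictly contained in another consistent one.
    return [s for s in consistent if not any(s < other for other in consistent)]


# Hypothetical inconsistent relation: "ann" is assigned to two departments.
database = [("ann", "sales"), ("ann", "hr"), ("bob", "it")]

repairs = subset_repairs(database)

# A row is a consistent answer to "list all rows" if it appears in every repair.
consistent_answers = frozenset.intersection(*repairs)
```

In this example there are two repairs, one keeping each of the conflicting rows for "ann", and only the row for "bob" survives in both, so it is the only consistent answer.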



Author(s):  
Gianluigi Greco ◽  
Sergio Greco ◽  
Ester Zumpano




2003 ◽  
Vol 296 (3) ◽  
pp. 405-434 ◽  
Author(s):  
Marcelo Arenas ◽  
Leopoldo Bertossi ◽  
Jan Chomicki ◽  
Xin He ◽  
Vijay Raghavan ◽  
...  


2012 ◽  
Vol 7 (8) ◽  
Author(s):  
Dong Xie ◽  
Xinbo Chen ◽  
Yan Zhu


2003 ◽  
Vol 3 (4+5) ◽  
pp. 393-424 ◽  
Author(s):  
Marcelo Arenas ◽  
Leopoldo Bertossi ◽  
Jan Chomicki

