Fuzzy Functional Dependencies (FFDs) as Integrity Constraints

1998 ◽  
pp. 119-134
Author(s):  
Guoqing Chen
2020 ◽  
Vol 13 (10) ◽  
pp. 1682-1695
Author(s):  
Ester Livshits ◽  
Alireza Heidari ◽  
Ihab F. Ilyas ◽  
Benny Kimelfeld

The problem of mining integrity constraints from data has been extensively studied over the past two decades for commonly used types of constraints, including the classic Functional Dependencies (FDs) and the more general Denial Constraints (DCs). In this paper, we investigate the problem of mining approximate DCs from data, that is, DCs that are "almost" satisfied. Approximation allows us to discover more accurate constraints in inconsistent databases and to detect rules that are generally correct but may have a few exceptions. It also allows us to avoid overfitting and to obtain constraints that are more general, more natural, and less contrived. We introduce the algorithm ADCMiner for mining approximate DCs. An important feature of this algorithm is that it does not assume any specific approximation function for DCs, but rather allows for arbitrary approximation functions that satisfy some natural axioms that we define in the paper. We also show how our algorithm can be combined with sampling to return highly accurate results considerably faster.
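One common way to quantify how "almost satisfied" a DC is — a candidate approximation function of the kind the paper axiomatizes, not necessarily the one used by ADCMiner — is the fraction of tuple pairs that jointly satisfy all of the DC's predicates (and therefore violate it). A minimal Python sketch, with a hypothetical predicate-based DC encoding:

```python
from itertools import combinations

def violation_ratio(rows, dc_predicates):
    """Fraction of tuple pairs on which every predicate of the denial
    constraint holds simultaneously, i.e., pairs that violate the DC.
    0.0 means the DC holds exactly; small values mean "almost" satisfied."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        return 0.0
    violating = sum(
        1 for s, t in pairs
        if all(pred(s, t) for pred in dc_predicates)
    )
    return violating / len(pairs)

# Hypothetical DC: no two employees in the same department may have
# salaries differing by more than a factor of 10.
rows = [
    {"dept": "R&D", "salary": 100},
    {"dept": "R&D", "salary": 120},
    {"dept": "R&D", "salary": 2000},  # the kind of exception approximation tolerates
]
dc = [
    lambda s, t: s["dept"] == t["dept"],
    lambda s, t: max(s["salary"], t["salary"]) > 10 * min(s["salary"], t["salary"]),
]
print(violation_ratio(rows, dc))  # 2 of 3 pairs violate -> ~0.67
```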


2009 ◽  
pp. 2051-2058
Author(s):  
Luciano Caroprese ◽  
Cristian Molinaro ◽  
Irina Trubitsyna ◽  
Ester Zumpano

Integrating data from different sources consists of two main steps: in the first, the various relations are merged together; in the second, some tuples are removed from (or inserted into) the resulting database in order to satisfy integrity constraints. There are several ways to integrate databases or possibly distributed information sources, but whatever integration architecture we choose, the heterogeneity of the sources to be integrated causes subtle problems. In particular, the database obtained from the integration process may be inconsistent with respect to integrity constraints, that is, one or more integrity constraints are not satisfied. Integrity constraints represent an important source of information about the real world. They are usually used to define constraints on data (functional dependencies, inclusion dependencies, etc.) and nowadays have wide applicability in several contexts, such as semantic query optimization, cooperative query answering, database integration, and view update.
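As an illustration of the two steps — simplified to a single key constraint and one deletion policy, not taken from the chapter itself — the following Python sketch first merges two source relations and then repairs the result by removing conflicting tuples:

```python
def merge(*sources):
    """Step 1: union the tuples coming from the different sources."""
    merged = []
    for relation in sources:
        merged.extend(relation)
    return merged

def repair_by_deletion(rows, key):
    """Step 2: enforce a key constraint by deleting tuples whose key value
    was already seen with a conflicting payload. Which tuple to keep is one
    simple policy; choosing it is exactly what repair semantics must decide."""
    seen = {}
    repaired = []
    for row in rows:
        k = row[key]
        if k not in seen:
            seen[k] = row
            repaired.append(row)
        # exact duplicates are silently absorbed; conflicting tuples dropped
    return repaired

source_a = [{"ssn": 1, "city": "Rome"}]
source_b = [{"ssn": 1, "city": "Milan"}, {"ssn": 2, "city": "Turin"}]
print(repair_by_deletion(merge(source_a, source_b), key="ssn"))
# [{'ssn': 1, 'city': 'Rome'}, {'ssn': 2, 'city': 'Turin'}]
```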


Author(s):  
Sergio Flesca ◽  
Sergio Greco ◽  
Ester Zumpano

Integrity constraints are a fundamental part of a database schema. They are generally used to define constraints on data (functional dependencies, inclusion dependencies, exclusion dependencies, etc.), and their enforcement ensures a semantically correct state of a database. As the presence of data inconsistent with respect to integrity constraints is not unusual, managing such inconsistency plays a key role in all the areas in which duplicate or conflicting information is likely to occur, such as database integration, data warehousing, and federated databases (Bry, 1997; Lin, 1996; Subrahmanian, 1994). It is well known that inconsistent data can be managed by "repairing" the database, that is, by providing consistent databases obtained from the original inconsistent one through a minimal set of update operations, or by consistently answering queries posed over the inconsistent database.
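A hedged sketch of the two options just mentioned, restricted to a single key constraint (the constraint and the minimality criterion are simplifying assumptions, not the chapter's formal definitions): each repair keeps exactly one tuple per key value, and a consistent answer is one returned by the query in every repair.

```python
from itertools import product
from functools import reduce

def repairs(rows, key):
    """All repairs under one key constraint: for each key value, keep
    exactly one of the tuples sharing it (deletion-minimal repairs)."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    return [list(choice) for choice in product(*groups.values())]

def consistent_answers(rows, key, query):
    """Consistent query answering: keep only answers the query returns
    in *every* repair of the inconsistent database."""
    per_repair = [set(query(r)) for r in repairs(rows, key)]
    return reduce(set.intersection, per_repair)

db = [
    {"ssn": 1, "city": "Rome"},
    {"ssn": 1, "city": "Milan"},  # conflicts with the tuple above
    {"ssn": 2, "city": "Turin"},
]
# "Which cities appear?" -- Turin is certain; Rome/Milan depend on the repair.
print(consistent_answers(db, "ssn", lambda rows: {r["city"] for r in rows}))
# {'Turin'}
```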


Author(s):  
Charalampos Nikolaou ◽  
Bernardo Cuenca Grau ◽  
Egor V. Kostylev ◽  
Mark Kaminski ◽  
Ian Horrocks

We extend ontology-based data access with integrity constraints over both the source and target schemas. The relevant reasoning problems in this setting are constraint satisfaction—to check whether a database satisfies the target constraints given the mappings and the ontology—and source-to-target (resp., target-to-source) constraint implication, which is to check whether a target constraint (resp., a source constraint) is satisfied by each database satisfying the source constraints (resp., the target constraints). We establish decidability and complexity bounds for all these problems in the case where ontologies are expressed in DL-LiteR and constraints range from functional dependencies to disjunctive tuple-generating dependencies.
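Setting aside the ontology reasoning (the hard, DL-LiteR-specific part of the paper), the constraint-satisfaction check can be pictured as: materialize the target relations via the mappings, then test the target constraint on the result. A simplified Python sketch with a hypothetical GAV-style mapping, offered only to fix intuitions:

```python
def materialize_target(source_db, mappings):
    """Apply GAV-style mappings: each mapping turns the rows of a source
    relation into rows of a target relation."""
    target = {}
    for src_rel, tgt_rel, transform in mappings:
        target.setdefault(tgt_rel, [])
        target[tgt_rel].extend(transform(row) for row in source_db[src_rel])
    return target

def satisfies_fd(rows, lhs, rhs):
    """Check the functional dependency lhs -> rhs on a target relation."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

source = {"Employees": [{"id": 1, "dept": "R&D"}, {"id": 1, "dept": "HR"}]}
mappings = [("Employees", "WorksIn", lambda r: {"emp": r["id"], "dept": r["dept"]})]
target = materialize_target(source, mappings)
print(satisfies_fd(target["WorksIn"], lhs=["emp"], rhs=["dept"]))  # False
```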


2002 ◽  
pp. 172-202
Author(s):  
Sergio Greco ◽  
Ester Zumpano

Integrity constraints represent an important source of information about the real world. They are usually used to define constraints on data (functional dependencies, inclusion dependencies, etc.). Nowadays, integrity constraints have wide applicability in several contexts, such as semantic query optimization, cooperative query answering, database integration, and view update. Often, databases may be inconsistent with respect to integrity constraints, that is, one or more integrity constraints are not satisfied. This may happen, for instance, when the database is obtained from the integration of different information sources. The integration of knowledge from multiple sources is an important aspect in several areas, such as data warehousing, database integration, automated reasoning systems, and active and reactive databases.


2017 ◽  
Vol 1 (1) ◽  
pp. 04 ◽  
Author(s):  
Jaroslav Pokorny

Compared with traditional (e.g., relational) databases, graph databases often lack some important database features. In particular, a graph database schema, including integrity constraints, is mostly not explicitly defined, and conceptual modelling is usually not used. Checking the consistency of a graph database is hard, because almost no integrity constraints are defined, or only very simple representatives of them can be specified. In the paper, we discuss these issues and present current possibilities and challenges in graph database modelling. We also focus on integrity constraint modelling and propose functional dependencies between entity types, which resembles the modelling of functional dependencies known from relational databases. We show a number of examples of often-cited GDBMSs and their approaches to database schema and IC specification. A conceptual level of graph database design is also considered. We propose a sufficient conceptual model based on a binary variant of the ER model and show its relationship to a graph database model, i.e., a mapping of conceptual schemas to database schemas. An alternative based on conceptual functions called attributes is presented.
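As a toy illustration of a functional dependency between entity types in a property graph (the graph encoding and property names below are hypothetical, not the paper's notation), one can check that all nodes of type Person agreeing on ssn also agree on name:

```python
def check_graph_fd(nodes, label, lhs, rhs):
    """Check an FD lhs -> rhs over all nodes carrying a given label,
    mirroring how relational FDs are checked over a relation."""
    seen = {}
    violations = []
    for node in nodes:
        if node["label"] != label:
            continue
        key = tuple(node["props"].get(a) for a in lhs)
        val = tuple(node["props"].get(a) for a in rhs)
        if key in seen and seen[key] != val:
            violations.append((key, seen[key], val))
        seen.setdefault(key, val)
    return violations

graph = [
    {"label": "Person", "props": {"ssn": 1, "name": "Ann"}},
    {"label": "Person", "props": {"ssn": 1, "name": "Anna"}},  # FD violation
    {"label": "City",   "props": {"name": "Prague"}},
]
print(check_graph_fd(graph, "Person", lhs=["ssn"], rhs=["name"]))
# [((1,), ('Ann',), ('Anna',))]
```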


2022 ◽  
Vol 18 (1) ◽  
Author(s):  
Batya Kenig ◽  
Dan Suciu

Integrity constraints such as functional dependencies (FDs) and multi-valued dependencies (MVDs) are fundamental in database schema design. Likewise, probabilistic conditional independences (CIs) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and that of the consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Then, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Finally, we show how some of the results in the paper can be derived using I-measure theory, which relates information-theoretic measures to set theory. Our results recover, and sometimes extend, previously known results about the implication problem: the implication of MVDs and FDs can be checked by considering only 2-tuple relations.
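The standard information-theoretic degree of satisfaction of an FD X → Y is the conditional entropy H(Y | X) of the relation's empirical distribution, which is 0 exactly when the FD holds. A short Python sketch computing it (my illustration of the measure, not the paper's algorithms):

```python
from collections import Counter
from math import log2

def conditional_entropy(rows, X, Y):
    """H(Y | X) over the empirical distribution of the relation:
    0.0 iff the FD X -> Y holds exactly; larger values measure how
    badly it fails (the degree of satisfaction that relaxation bounds)."""
    n = len(rows)
    xy = Counter((tuple(r[a] for a in X), tuple(r[a] for a in Y)) for r in rows)
    x = Counter(tuple(r[a] for a in X) for r in rows)
    return sum(c / n * log2(x[xk] / c) for (xk, _), c in xy.items())

rows = [
    {"A": 0, "B": 0},
    {"A": 0, "B": 0},
    {"A": 1, "B": 0},
    {"A": 1, "B": 1},  # A -> B fails on the A=1 group
]
print(conditional_entropy(rows, X=["A"], Y=["B"]))  # 0.5
```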


2020 ◽  
pp. 9-13
Author(s):  
A. V. Lapko ◽  
V. A. Lapko

An original technique is justified for the fast selection of bandwidths of kernel functions in a nonparametric estimate of a multidimensional probability density of the Rosenblatt–Parzen type. The proposed method makes it possible to significantly increase the computational efficiency of the optimization procedure for kernel probability density estimates under conditions of large-volume statistical data, in comparison with traditional approaches. The basis of the proposed approach is the analysis of the formula for the optimal bandwidth parameters of a multidimensional kernel probability density estimate. Dependencies of the nonlinear functional of the probability density and its derivatives up to the second order, inclusive, on the antikurtosis coefficients of the random variables are found. The bandwidth for each random variable is represented as the product of an undetermined parameter and its standard deviation. The influence of the error in restoring the established functional dependencies on the approximation properties of the kernel probability density estimate is determined. The obtained results are implemented as a method for the synthesis and analysis of fast bandwidth selection for the kernel estimate of the two-dimensional probability density of independent random variables. This method uses data on the quantitative characteristics of a family of lognormal distribution laws.
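The "bandwidth = parameter × standard deviation" structure has the same shape as classical plug-in rules. A hedged Python sketch using Scott's rule-of-thumb constant (a standard textbook choice standing in for the authors' optimized parameter, which this sketch does not reproduce):

```python
import numpy as np

def rule_of_thumb_bandwidths(data):
    """Per-dimension bandwidths h_j = c * sigma_j for a Rosenblatt-Parzen
    estimate, with c = n^(-1/(d+4)) (Scott's rule). The paper tunes the
    constant via functionals of the density instead, but the
    product-with-sigma structure is the same."""
    n, d = data.shape
    c = n ** (-1.0 / (d + 4))
    return c * data.std(axis=0, ddof=1)

def kde(query, data, h):
    """Gaussian product-kernel density estimate at the query points."""
    # (m, n, d) array of scaled differences between queries and samples
    u = (query[:, None, :] - data[None, :, :]) / h
    kernels = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return kernels.prod(axis=2).mean(axis=1) / h.prod()

rng = np.random.default_rng(0)
data = rng.lognormal(size=(1000, 2))  # lognormal laws, as in the paper's study
h = rule_of_thumb_bandwidths(data)
print(h, kde(np.array([[1.0, 1.0]]), data, h))
```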


2011 ◽  
Vol 21 (SI) ◽  
pp. 95-123 ◽  
Author(s):  
François Pinet ◽  
Magali Duboisset ◽  
Michel Schneider

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Rebecca Davies ◽  
Ling Liu ◽  
Sheng Taotao ◽  
Natasha Tuano ◽  
Richa Chaturvedi ◽  
...  

Introduction: Genes contain multiple promoters that can drive the expression of various transcript isoforms. Although transcript isoforms from the same gene could have diverse and non-overlapping functions, current loss-of-function methodologies are not able to differentiate between isoform-specific phenotypes.
Results: Here, we show that CRISPR interference (CRISPRi) can be adopted for targeting specific promoters within a gene, enabling isoform-specific loss-of-function genetic screens. We use this strategy to test functional dependencies of 820 transcript isoforms that are gained in gastric cancer (GC). We identify a subset of GC-gained transcript isoform dependencies, and of these, we validate CIT kinase as a novel GC dependency. We further show that some genes express isoforms with opposite functions. Specifically, we find that the tumour suppressor ZFHX3 expresses an isoform that has a paradoxical oncogenic role that correlates with poor patient outcome.
Conclusions: Our work finds isoform-specific phenotypes that would not be identified using current loss-of-function approaches that are not designed to target specific transcript isoforms.
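A promoter-level dependency score in such a screen is typically an aggregate of guide-level log-fold-changes in abundance between final and initial time points. A purely illustrative Python sketch (the column names, pseudocount, and median aggregation are hypothetical conventions, not the paper's pipeline):

```python
from statistics import median
from math import log2

def promoter_scores(guide_counts):
    """Aggregate guide-level log2 fold-changes (final vs. initial abundance)
    into a median score per targeted promoter; strongly negative scores
    suggest the isoform driven by that promoter is a dependency."""
    by_promoter = {}
    for g in guide_counts:
        lfc = log2((g["final"] + 1) / (g["initial"] + 1))  # pseudocount of 1
        by_promoter.setdefault(g["promoter"], []).append(lfc)
    return {p: median(lfcs) for p, lfcs in by_promoter.items()}

guides = [
    {"promoter": "CIT_P2", "initial": 500, "final": 60},
    {"promoter": "CIT_P2", "initial": 450, "final": 80},
    {"promoter": "CIT_P2", "initial": 480, "final": 50},
    {"promoter": "ZFHX3_P1", "initial": 400, "final": 900},
]
print(promoter_scores(guides))  # CIT_P2 strongly depleted -> candidate dependency
```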

