The implication problem of data dependencies over SQL table definitions

2012 ◽  
Vol 37 (2) ◽  
pp. 1-40 ◽  
Author(s):  
Sven Hartmann ◽  
Sebastian Link

2022 ◽  
Vol 18, Issue 1 ◽  
Author(s):  
Batya Kenig ◽  
Dan Suciu

Integrity constraints such as functional dependencies (FD) and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem asks whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and that of the consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that admits no relaxation; however, we prove that every implication between CIs relaxes "in the limit". Then, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Finally, we show how some of the results in the paper can be derived using the I-measure theory, which relates information-theoretic measures to set theory. Our results recover, and sometimes extend, previously known results about the implication problem, for example that the implication of MVDs and FDs can be checked by considering only two-tuple relations.
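For intuition, the relaxation statement has the following shape in the paper's information-theoretic setting (the symbols h and λ below are illustrative notation, and Z denotes the attributes outside X and Y):

```latex
% Sketch of the relaxation notion (notation illustrative, not verbatim).
% Degrees of satisfaction are measured information-theoretically, e.g.
%   h(X -> Y)  = H(Y | X)      (an FD holds exactly iff this is 0)
%   h(X ->> Y) = I(Y; Z | X)   (an MVD holds exactly iff this is 0)
\[
  \sigma_1, \dots, \sigma_k \models \tau
  \quad\text{relaxes to}\quad
  h(\tau) \;\le\; \lambda \sum_{i=1}^{k} h(\sigma_i)
  \quad\text{for every distribution,}
\]
% where the paper shows \lambda can be taken at most quadratic in the number
% of variables for MVD+FD antecedents, and \lambda = 1 when \tau is an FD.
```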


2021 ◽  
Author(s):  
Van Tran Bao Le

A database is said to be C-Armstrong for a finite set Σ of data dependencies in a class C if the database satisfies all data dependencies in Σ and violates all data dependencies in C that are not implied by Σ. Therefore, Armstrong databases are concise, user-friendly representations of abstract data dependencies that can be used to judge, justify, convey, and test the understanding of database design choices. Indeed, an Armstrong database satisfies exactly those data dependencies that are considered meaningful by the current design choice Σ. Structural and computational properties of Armstrong databases have been deeply investigated in Codd’s Turing Award winning relational model of data. Armstrong databases have been incorporated in approaches towards relational database design. They have also been found useful for the elicitation of requirements, the semantic sampling of existing databases, and the specification of schema mappings. This research establishes a toolbox of Armstrong databases for SQL data. This is challenging as SQL data can contain null marker occurrences in columns declared NULL, and may contain duplicate rows. Thus, the existing theory of Armstrong databases only applies to idealized instances of SQL data, that is, instances without null marker occurrences and without duplicate rows. For the thesis, two popular interpretations of null markers are considered: the "no information" interpretation used in SQL, and the "exists but unknown" interpretation by Codd. Furthermore, the study is limited to the popular class C of functional dependencies. However, the presence of duplicate rows means that the class of uniqueness constraints is no longer subsumed by the class of functional dependencies, in contrast to the relational model of data. As a first contribution, a provably correct algorithm is developed that computes Armstrong databases for an arbitrarily given finite set of uniqueness constraints and functional dependencies. This contribution is based on axiomatic, algorithmic, and logical characterizations of the associated implication problem that are also established in this thesis. While the problem of deciding whether a given database is Armstrong for a given set of such constraints is precisely exponential, our algorithm computes an Armstrong database with a number of rows that is at most quadratic in the number of rows of a minimum-sized Armstrong database. As a second contribution, the algorithms are implemented in the form of a design tool. Users of the tool can therefore inspect Armstrong databases to analyze their current design choice Σ. Intuitively, Armstrong databases are useful for the acquisition of semantically meaningful constraints if users can recognize the actual meaningfulness of constraints that they incorrectly perceived as meaningless before inspecting an Armstrong database. As a final contribution, measures are introduced that formalize the term "useful", and detailed experiments show that Armstrong tables, as computed by the tool, are indeed useful. In summary, this research establishes a toolbox of Armstrong databases that can be applied by database designers to concisely visualize constraints on SQL data. Such support can lead to database designs that guarantee efficient data management in practice.
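To make the defining property concrete, the following is a minimal sketch for the idealized, null- and duplicate-free case that the thesis generalizes to SQL data; the function names and the brute-force check are illustrative, not the thesis algorithm:

```python
# Minimal sketch: a table is Armstrong for a set sigma of FDs iff it
# satisfies exactly the FDs that sigma implies (idealized relational case).
from itertools import combinations

def satisfies_fd(rows, lhs, rhs):
    """True iff every pair of rows agreeing on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

def implied(sigma, lhs, rhs):
    """FD lhs -> rhs follows from sigma iff rhs lies in the closure of lhs."""
    closure, changed = set(lhs), True
    while changed:
        changed = False
        for x, y in sigma:
            if set(x) <= closure and not set(y) <= closure:
                closure |= set(y)
                changed = True
    return set(rhs) <= closure

def is_armstrong(rows, sigma, attrs):
    """Check the Armstrong property for all nontrivial single-attribute FDs."""
    for r in range(1, len(attrs) + 1):
        for lhs in combinations(attrs, r):
            for a in attrs:
                if a in lhs:
                    continue  # trivial FDs hold everywhere
                if satisfies_fd(rows, lhs, (a,)) != implied(sigma, lhs, (a,)):
                    return False
    return True

# Example: a table that is Armstrong for sigma = { A -> B } over (A, B, C).
rows = [{"A": 0, "B": 0, "C": 0},
        {"A": 0, "B": 0, "C": 1},
        {"A": 1, "B": 2, "C": 0},
        {"A": 2, "B": 0, "C": 2},
        {"A": 3, "B": 0, "C": 0}]
print(is_armstrong(rows, [(("A",), ("B",))], ("A", "B", "C")))  # True
```

The example table satisfies A → B yet violates every functional dependency not implied by it (for instance B → A, C → B, and BC → A), which is exactly the Armstrong property.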


Micromachines ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 183 ◽  
Author(s):  
Jose Ricardo Gomez-Rodriguez ◽  
Remberto Sandoval-Arechiga ◽  
Salvador Ibarra-Delgado ◽  
Viktor Ivan Rodriguez-Abdala ◽  
Jose Luis Vazquez-Avila ◽  
...  

Current computing platforms encourage the integration of thousands of processing cores, and their interconnections, into a single chip. Mobile smartphones, IoT and embedded devices, desktops, and data centers use Many-Core Systems-on-Chip (SoCs) to exploit this compute power and parallelism to meet dynamic workload requirements. Networks-on-Chip (NoCs) provide scalable connectivity for diverse applications with distinct traffic patterns and data dependencies. However, when the system executes various applications on a traditional NoC, which is optimized and fixed at synthesis time, the mismatch between the interconnect and the differing application requirements limits performance. In the literature, NoC designs have embraced the Software-Defined Networking (SDN) strategy to evolve into an adaptable interconnection solution for future chips. However, the works surveyed implement only a partial Software-Defined Network-on-Chip (SDNoC) approach, leaving aside the layered SDN architecture that brings interoperability to conventional networking. This paper explores the SDNoC literature and classifies it with respect to the desired SDN features that each work presents. We then describe the challenges and opportunities identified in the survey, explain the motivation for an SDNoC approach, and present both SDN and SDNoC concepts and architectures. We observe that the works in the literature employ an incomplete layered SDNoC approach, which leaves several fertile areas of the SDNoC architecture where researchers may contribute to Many-Core SoC designs.
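As a minimal illustration of the control/data-plane split that the surveyed SDNoC designs adopt only partially, consider the following sketch (class and field names are hypothetical; real SDNoC routers operate on flits with hardware flow tables):

```python
# Illustrative sketch of the SDN separation of concerns: a centralized
# controller computes forwarding rules, switches only do table lookups.
class Controller:
    """Control plane: computes and installs forwarding rules on demand."""
    def compute_rule(self, packet):
        # Placeholder policy: pick an output port from the destination id
        # (an SDNoC controller might instead compute an XY route here).
        return packet["dst"], "port-%d" % (packet["dst"] % 4)

class Switch:
    """Data plane: forwards by flow-table lookup, asks controller on a miss."""
    def __init__(self, controller):
        self.table = {}            # dst -> output port
        self.controller = controller
    def forward(self, packet):
        dst = packet["dst"]
        if dst not in self.table:                  # table miss
            key, port = self.controller.compute_rule(packet)
            self.table[key] = port                 # rule installed once,
        return self.table[dst]                     # reused for later packets

sw = Switch(Controller())
print(sw.forward({"dst": 6}))   # miss -> controller computes "port-2"
print(sw.forward({"dst": 6}))   # hit  -> forwarded without the controller
```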


Author(s):  
Chafik Arar ◽  
Mohamed Salah Khireddine

The paper proposes a new reliable fault-tolerant scheduling algorithm for real-time embedded systems. The proposed algorithm is based on static scheduling, which allows it to take the execution cost of tasks and the data dependencies between them into account in its scheduling decisions. The algorithm targets multi-bus heterogeneous architectures in which multiple processors are linked by several shared buses. It tolerates a single bus fault, caused by a hardware fault and compensated by software redundancy, and uses both active and passive backup copies to minimize the schedule length of data communications on the buses. In the experiments, the proposed method is evaluated in terms of data scheduling length on a set of DSP benchmarks. The experimental results show the effectiveness of the technique.
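A minimal sketch of this kind of bus-level replication, assuming a greedy earliest-bus placement (the names and the placement policy are illustrative, not the paper's algorithm):

```python
# Hypothetical sketch: statically place each data transfer on the
# earliest-free bus, plus a replica on a different bus, so that any
# single bus fault leaves one intact copy of every transfer.
def schedule(transfers, n_buses):
    """transfers: list of (name, cost). Returns per-bus timelines."""
    free = [0.0] * n_buses                     # next free time on each bus
    plan = {b: [] for b in range(n_buses)}
    for name, cost in transfers:
        # primary copy on the earliest-available bus
        p = min(range(n_buses), key=free.__getitem__)
        plan[p].append((name, free[p], free[p] + cost))
        free[p] += cost
        # backup copy on a different bus; a *passive* copy would be sent
        # only after a fault is detected, shortening the fault-free schedule
        b = min((i for i in range(n_buses) if i != p), key=free.__getitem__)
        plan[b].append((name + "*", free[b], free[b] + cost))
        free[b] += cost
    return plan

print(schedule([("d1", 2), ("d2", 3), ("d3", 1)], n_buses=2))
```

The sketch only replicates actively; combining active with passive copies, as the paper does, reclaims the backup slots in the fault-free case and is what reduces the overall schedule length.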


Author(s):  
Srikanth M. Kannapan ◽  
Dean L. Taylor

Abstract Naive interpretations of concurrent engineering may expect extreme parallelization of tasks and simultaneous accommodation of multiple perspectives. In fact, from our efforts at modeling tasks in a MEMS (Micro-Electro-Mechanical Systems) pressure sensor design project, it appears that data dependencies due to the structure of tasks and the product itself result in scenarios of decision and action that must be carefully coordinated. This paper refines a previously described information model for defining evolving contexts of product model aspects and team member perspectives, with software agents acting on behalf of team members to execute tasks. The pressure sensor design project is analyzed in the framework of the information model. A scenario of decision and action for design of the pressure sensor is modeled as a design process plan. Conflict on a shared parameter occurs as a consequence of introducing some parallelism between the capacitance and deflection agents in the process. We present a technique for negotiating such conflicts by definition and propagation of utility functions on decision parameters and axiomatic negotiation.
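A toy sketch of the negotiation mechanism described above, assuming quadratic utility functions over a single shared parameter (the utilities, the parameter, and its range are invented for illustration):

```python
# Illustrative sketch: two agents hold utility functions over a shared
# design parameter -- here a membrane thickness t -- and a conflict is
# settled by picking the value that maximizes their joint utility.
def u_capacitance(t):   # prefers a thin membrane: peak utility at t = 2.0
    return -(t - 2.0) ** 2

def u_deflection(t):    # prefers a thick membrane: peak utility at t = 5.0
    return -(t - 5.0) ** 2

def negotiate(utilities, candidates):
    """Pick the candidate maximizing the sum of the propagated utilities."""
    return max(candidates, key=lambda t: sum(u(t) for u in utilities))

ts = [i / 10 for i in range(10, 81)]        # candidate thicknesses 1.0..8.0
t_star = negotiate([u_capacitance, u_deflection], ts)
print(f"agreed thickness: {t_star:.1f}")    # 3.5, between the two optima
```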


2009 ◽  
Vol 27 (4) ◽  
pp. 1377-1386 ◽  
Author(s):  
M. Antón ◽  
D. Loyola ◽  
M. López ◽  
J. M. Vilaplana ◽  
M. Bañón ◽  
...  

Abstract. The main objective of this article is to compare the total ozone data from the new Global Ozone Monitoring Experiment instrument (GOME-2/MetOp) with reliable ground-based measurements recorded by five Brewer spectroradiometers in the Iberian Peninsula. In addition, a similar comparison for the predecessor instrument GOME/ERS-2 is described. The period of study is a whole year, from May 2007 to April 2008. The results show that GOME-2/MetOp ozone data are already of very good quality: total ozone columns are on average 3.05% lower than Brewer measurements. This underestimation is higher than that obtained for GOME/ERS-2 (1.46%). However, the relative differences between GOME-2/MetOp and Brewer measurements show significantly lower variability than the differences between GOME/ERS-2 and Brewer data. Dependencies of these relative differences on the satellite solar zenith angle (SZA), the satellite scan angle, the satellite cloud cover fraction (CF), and the ground-based total ozone measurements are analyzed. For both GOME instruments, the differences show no significant dependence on SZA. However, GOME-2/MetOp data show a significant dependence on the satellite scan angle (+1.5%). In addition, GOME/ERS-2 differences present a clear dependence on CF and on ground-based total ozone; such dependences are minimized for GOME-2/MetOp. The comparison between the daily total ozone values provided by the two GOME instruments shows that GOME-2/MetOp ozone data are on average 1.46% lower than GOME/ERS-2 data, without any seasonal dependence. Finally, deviations of the a priori climatological ozone profile used by the satellite retrieval algorithm from the true ozone profile are analyzed. Although excellent agreement between a priori climatological and measured partial ozone values is found for the middle and upper stratosphere, relative differences greater than 15% are common in the troposphere and lower stratosphere.
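For clarity, the comparison metric used throughout (a mean relative difference in percent) can be sketched as follows; the numbers below are invented, not the article's data:

```python
# Hedged sketch of the comparison metric (variable names illustrative):
# relative difference of satellite vs. ground-based total ozone, in percent.
def relative_differences(satellite, ground):
    """Per-pair relative difference 100 * (sat - ground) / ground."""
    return [100.0 * (s - g) / g for s, g in zip(satellite, ground)]

sat    = [298.0, 305.0, 310.0]   # GOME-2 total ozone columns (DU), made up
ground = [308.0, 314.0, 320.0]   # coincident Brewer measurements (DU), made up
diffs = relative_differences(sat, ground)
mean = sum(diffs) / len(diffs)
print(f"mean relative difference: {mean:+.2f}%")  # negative -> underestimation
```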

