Mapping Referential Integrity Constraints from Relational Databases to XML

Author(s):  
Xiaochun Yang ◽  
Guoren Wang
2018 ◽  
Vol 15 (3) ◽  
pp. 821-843
Author(s):  
Jovana Vidakovic ◽  
Sonja Ristic ◽  
Slavica Kordic ◽  
Ivan Lukovic

A database management system (DBMS) is based on a data model whose concepts are used to express a database schema. Each data model has a specific set of integrity constraint types. Some integrity constraint types, such as key, unique, and foreign key constraints, are supported by most DBMSs. Other, more complex constraint types are difficult to express and enforce and are largely disregarded by existing DBMSs. Users have to enforce them with custom procedures or triggers. Extensible Markup Language (XML) has become the universal format for representing and exchanging data. Very often XML data are generated from relational databases and exported to a target application or another database. In this context, integrity constraints play an essential role in preserving the original semantics of the data. Integrity constraints have been extensively studied in the relational data model. Mechanisms provided by XML schema languages rely on a simple form of constraints that is sufficient neither for expressing semantic constraints commonly found in databases nor for expressing more complex constraints induced by the business rules of the system under study. In this paper we present a classification of constraint types in the relational data model, discuss possible declarative mechanisms for their specification and enforcement in the XML data model, and illustrate our approach to the definition and enforcement of complex constraint types in the XML data model, using the extended tuple constraint type as an example.
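To make the distinction between simple and more complex constraints concrete, the following is a minimal sketch of checking two constraints on XML exported from relational tables: a foreign key (each employee refers to an existing department) and a tuple-level condition that compares two values of the same record. It is a procedural check written for illustration, not the declarative mechanism the paper proposes. The element names, attribute names, and the rule that `left` may not precede `hired` are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical export of two relational tables (department, employee) to XML.
DOC = """
<company>
  <department id="d1"/>
  <department id="d2"/>
  <employee id="e1" dept="d1" hired="2015-03-01" left="2019-06-30"/>
  <employee id="e2" dept="d3" hired="2020-01-15" left="2018-01-01"/>
</company>
"""


def check_constraints(xml_text):
    """Return a list of (employee id, violated constraint) pairs."""
    root = ET.fromstring(xml_text)
    dept_ids = {d.get("id") for d in root.iter("department")}
    violations = []
    for emp in root.iter("employee"):
        # Foreign key: every employee/@dept must match some department/@id.
        if emp.get("dept") not in dept_ids:
            violations.append((emp.get("id"), "foreign key"))
        # Tuple constraint (assumed rule): compares two attributes of the
        # same element. ISO 8601 dates order correctly as plain strings.
        left = emp.get("left")
        if left is not None and left < emp.get("hired"):
            violations.append((emp.get("id"), "tuple"))
    return violations


violations = check_constraints(DOC)
```

In this sample, employee `e2` violates both rules: it points to a department that does not exist, and its `left` date precedes its `hired` date.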


Semantic Web ◽  
2021 ◽  
pp. 1-34
Author(s):  
David Chaves-Fraga ◽  
Edna Ruckhaus ◽  
Freddy Priyatna ◽  
Maria-Esther Vidal ◽  
Oscar Corcho

Ontology-Based Data Access (OBDA) has traditionally focused on providing a unified view of heterogeneous datasets (e.g., relational databases, CSV and JSON files), either by materializing integrated data into RDF or by performing on-the-fly querying via SPARQL query translation. In the specific case of tabular datasets represented as several CSV or Excel files, query translation approaches have been applied by considering each source as a single table that can be loaded into a relational database management system (RDBMS). Nevertheless, constraints over these tables are not represented (e.g., referential integrity among sources, datatypes, or data integrity); thus, neither consistency among attributes nor indexes over tables are enforced. As a consequence, efficiency of the SPARQL-to-SQL translation process may be affected, as well as the completeness of the answers produced during the evaluation of the generated SQL query. Our work is focused on applying implicit constraints on the OBDA query translation process over tabular data. We propose Morph-CSV, a framework for querying tabular data that exploits information from typical OBDA inputs (e.g., mappings, queries) to enforce constraints that can be used together with any SPARQL-to-SQL OBDA engine. Morph-CSV relies on both a constraint component and a set of constraint operators. For a given set of constraints, the operators are applied to each type of constraint with the aim of enhancing query completeness and performance. We evaluate Morph-CSV in several domains: e-commerce with the BSBM benchmark; transportation with the GTFS-Madrid benchmark; and biology with a use case extracted from the Bio2RDF project. We compare and report the performance of two SPARQL-to-SQL OBDA engines, without and with the incorporation of Morph-CSV. The observed results suggest that Morph-CSV is able to speed up the total query execution time by up to two orders of magnitude, while it is able to produce all the query answers.
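The general idea of loading CSV sources into an RDBMS with constraints declared, rather than as untyped tables, can be sketched as follows. This is not Morph-CSV's implementation: Morph-CSV derives constraints from OBDA inputs such as mappings and queries, whereas here they are written by hand. The table names, column names, and sample rows are hypothetical, and SQLite stands in for whatever RDBMS an OBDA engine would use.

```python
import csv
import io
import sqlite3

# Hypothetical CSV sources; trip t2 refers to a route that does not exist.
ROUTES_CSV = "route_id,name\nr1,Line 1\nr2,Line 2\n"
TRIPS_CSV = "trip_id,route_id\nt1,r1\nt2,r9\n"


def load(conn, table, text):
    """Insert CSV rows into `table`; return the rows the constraints reject."""
    rows = list(csv.DictReader(io.StringIO(text)))
    cols = list(rows[0].keys())
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(cols), ", ".join("?" for _ in cols)
    )
    rejected = []
    for row in rows:
        try:
            conn.execute(sql, [row[c] for c in cols])
        except sqlite3.IntegrityError:
            rejected.append(row)
    return rejected


conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE routes (route_id TEXT PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE trips ("
    "trip_id TEXT PRIMARY KEY, "
    "route_id TEXT NOT NULL REFERENCES routes(route_id))"
)

rejected_routes = load(conn, "routes", ROUTES_CSV)
rejected_trips = load(conn, "trips", TRIPS_CSV)
```

With the constraints declared, the inconsistent row is caught at load time instead of silently affecting the results of later join queries, and the primary keys give the engine indexes to use.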

