Towards a Normal Form and a Query Language for Extended Relations Defined by Regular Expressions

2016 ◽  
Vol 27 (2) ◽  
pp. 27-48
Author(s):  
András Benczúr ◽  
Gyula I. Szabó

This paper introduces a generalized database concept that unites the relational and semi-structured data models. As an important theoretical result, we found a quadratic decision algorithm for the implication problem of functional and join dependencies defined on the united data model. As a practical contribution, we present a normal form for the new data model as a tool for database design. With our novel representations of regular expressions, a more effective searching method could be developed. XML elements are described by XML schema languages such as a DTD or an XML Schema definition. The instances of these elements are semi-structured tuples. A semi-structured tuple is an ordered list of (attribute: value) pairs. We may think of a semi-structured tuple as a sentence of a formal language, where the values are the terminal symbols and the attribute names are the non-terminal symbols. In the authors' former work (Szabó and Benczúr, 2015) they introduced the notion of the extended tuple as a sentence from a regular language generated by a grammar where the non-terminal symbols of the grammar are the attribute names of the tuple. Sets of extended tuples are the extended relations. The authors then introduced the dual language, which generates the tuple types allowed to occur in extended relations. They defined functional dependencies (regular FDs, RFDs) over extended relations. In this paper they rephrase the RFD concept by directly using regular expressions over attribute names to define extended tuples. With the help of a special vertex-labeled graph associated with regular expressions, the substring selection for the projection operation can be specified. The normalization for regular schemas is more complex than it is in the relational model, because the schema of an extended relation can contain an infinite number of tuple types.
However, the authors can define selection, projection and join operations on extended relations too, so a lossless-join decomposition can be performed. They extended their previous model to deal with XML schema indicators too, e.g., with numerical constraints. They added line and set constructors too, in order to extend their model with more general projection and selection operators. This model establishes a query language with table join functionality for collected XML element data.
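
The idea of a semi-structured tuple whose attribute-name sequence must belong to a regular language can be illustrated with a minimal sketch. It assumes a hypothetical schema (a name, any number of phones, an optional email) and uses Python's `re` module on the sequence of attribute names; the names `SCHEMA`, `tuple_type`, and `conforms` are illustrative and are not taken from the paper, whose own method relies on a vertex-labeled graph.

```python
import re

# Hypothetical schema: a regular expression over attribute names.
# Every attribute name is followed by one space so that names cannot run together.
SCHEMA = re.compile(r"name (phone )*(email )?")

def tuple_type(ext_tuple):
    """Return the ordered list of attribute names of an extended tuple."""
    return [attr for attr, _ in ext_tuple]

def conforms(ext_tuple, schema=SCHEMA):
    """True if the attribute-name sequence is a word of the schema's language."""
    word = "".join(attr + " " for attr in tuple_type(ext_tuple))
    return schema.fullmatch(word) is not None

# An extended tuple is an ordered list of (attribute, value) pairs.
t1 = [("name", "Ann"), ("phone", "555-0100"), ("phone", "555-0101"),
      ("email", "ann@example.org")]
t2 = [("phone", "555-0100"), ("name", "Bob")]  # wrong attribute order

print(conforms(t1), conforms(t2))
```

Because `(phone )*` can repeat without bound, this single schema already admits infinitely many tuple types, which is the property that makes normalization harder than in the relational model.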

Author(s):  
Devendra K. Tayal ◽  
P. C. Saxena

In this paper we discuss an important integrity constraint called multivalued dependency (mvd), which occurs as a result of the first normal form, in the framework of a newly proposed model called the fuzzy multivalued relational data model. The fuzzy multivalued relational data model proposed in this paper accommodates a wider class of ambiguities by representing the domain of attributes as a “set of fuzzy subsets”. We show that our model is able to represent multiple types of impreciseness occurring in the real world. To compute the equality of two fuzzy sets/values (which occur as tuple-values), we use the concept of fuzzy functions. The main objective of this paper is to extend mvds to the fuzzy multivalued relational model, giving fuzzy multivalued dependencies (fmvds), so that a wider class of impreciseness can be captured. Since mvds may not exist in isolation, a complete axiomatization for a set of fuzzy functional dependencies (ffds) and mvds in a fuzzy multivalued relational schema is provided, and the role of fmvds in obtaining the lossless join decomposition is discussed. We also provide a set of sound inference rules for the fmvds and derive the conditions for these inference rules to be complete. We also derive the conditions for obtaining the lossless join decomposition of a fuzzy multivalued relational schema in the presence of the fmvds. Finally, we extend ABU's Algorithm to find the lossless join decomposition in the context of fuzzy multivalued relational databases. We apply all of the fmvd concepts we developed to a real-world application, “Technical Institute”, and demonstrate how well the concepts capture multiple types of impreciseness.


2016 ◽  
Vol 64 (3) ◽  
pp. 457-466 ◽  
Author(s):  
A. Czerepicki

The article presents an innovative concept of applying graph databases in transport information systems. The model of a graph database is presented together with the implementation of data structures and search operations in a graph. A concept for transforming the relational model into a graph data model has been developed. A graph database schema is proposed for the purposes of a public transport information system. The realization methods are illustrated by a search function based on the Cypher query language.
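
As a rough illustration of turning relational rows into a graph and searching it, the sketch below uses plain Python rather than a graph database or Cypher. The `stops` and `connections` tables, the stop names, and the functions `to_graph` and `shortest_route` are all hypothetical and are not taken from the article.

```python
from collections import deque

# Hypothetical relational tables: stops(id, name) and connections(from_id, to_id).
stops = [(1, "Central"), (2, "Market"), (3, "Harbour"), (4, "Airport")]
connections = [(1, 2), (2, 3), (1, 4)]

def to_graph(stops, connections):
    """Turn rows into a directed adjacency list: nodes are stops, edges are connections."""
    names = dict(stops)
    graph = {name: [] for name in names.values()}
    for src, dst in connections:
        graph[names[src]].append(names[dst])
    return graph

def shortest_route(graph, start, goal):
    """Breadth-first search; returns a route with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

graph = to_graph(stops, connections)
print(shortest_route(graph, "Central", "Harbour"))
```

In the relational form the same search would need a self-join on the connections table for every hop, whereas in the graph form it is a single traversal over adjacency lists.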


Author(s):  
Antonio Badia

The relational data model is the dominant paradigm in the commercial database market today, and it has been for several years. However, there have been challenges to the model over the years, and they have influenced its evolution and that of database technology. The object-oriented revolution that started in programming languages arrived in the database area in the form of a brand new data model. The relational model managed not only to survive the newcomer but to remain a dominant force, transformed into the object-relational model (also called extended relational, or universal) and relegating object-oriented databases to a niche product. Although this market has many nontechnical aspects, there are certainly important technical differences among the mentioned data models. In this article I describe the basic components of the relational, object-oriented, and object-relational data models. I do not, however, discuss query language, implementation, or system issues. A basic comparison is given and then future trends are discussed.


2021 ◽  
Author(s):  
Van Tran Bao Le

A database is said to be C-Armstrong for a finite set Σ of data dependencies in a class C if the database satisfies all data dependencies in Σ and violates all data dependencies in C that are not implied by Σ. Therefore, Armstrong databases are concise, user-friendly representations of abstract data dependencies that can be used to judge, justify, convey, and test the understanding of database design choices. Indeed, an Armstrong database satisfies exactly those data dependencies that are considered meaningful by the current design choice Σ. Structural and computational properties of Armstrong databases have been deeply investigated in Codd’s Turing Award winning relational model of data. Armstrong databases have been incorporated in approaches towards relational database design. They have also been found useful for the elicitation of requirements, the semantic sampling of existing databases, and the specification of schema mappings. This research establishes a toolbox of Armstrong databases for SQL data. This is challenging as SQL data can contain null marker occurrences in columns declared NULL, and may contain duplicate rows. Thus, the existing theory of Armstrong databases only applies to idealized instances of SQL data, that is, instances without null marker occurrences and without duplicate rows. For the thesis, two popular interpretations of null markers are considered: the no information interpretation used in SQL, and the exists but unknown interpretation by Codd. Furthermore, the study is limited to the popular class C of functional dependencies. However, the presence of duplicate rows means that the class of uniqueness constraints is no longer subsumed by the class of functional dependencies, in contrast to the relational model of data. As a first contribution a provably-correct algorithm is developed that computes Armstrong databases for an arbitrarily given finite set of uniqueness constraints and functional dependencies.
This contribution is based on axiomatic, algorithmic and logical characterizations of the associated implication problem that are also established in this thesis. While the problem to decide whether a given database is Armstrong for a given set of such constraints is precisely exponential, our algorithm computes an Armstrong database with a number of rows that is at most quadratic in the number of rows of a minimum-sized Armstrong database. As a second contribution the algorithms are implemented in the form of a design tool. Users of the tool can therefore inspect Armstrong databases to analyze their current design choice Σ. Intuitively, Armstrong databases are useful for the acquisition of semantically meaningful constraints, if the users can recognize the actual meaningfulness of constraints that they incorrectly perceived as meaningless before the inspection of an Armstrong database. As a final contribution, measures are introduced that formalize the term “useful” and it is shown by some detailed experiments that Armstrong tables, as computed by the tool, are indeed useful. In summary, this research establishes a toolbox of Armstrong databases that can be applied by database designers to concisely visualize constraints on SQL data. Such support can lead to database designs that guarantee efficient data management in practice.
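
The remark that duplicate rows separate uniqueness constraints from functional dependencies can be checked on a tiny example. The sketch below is only an illustration under simplifying assumptions: rows are Python dictionaries, null markers are ignored entirely, and the functions `satisfies_fd` and `satisfies_uc` and the sample data are hypothetical rather than part of the thesis or its tool.

```python
def satisfies_fd(rows, lhs, rhs):
    """Functional dependency lhs -> rhs: rows that agree on lhs must agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time key is seen, then returns the stored value
        if seen.setdefault(key, val) != val:
            return False
    return True

def satisfies_uc(rows, attrs):
    """Uniqueness constraint: no two row occurrences agree on all of attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(keys) == len(set(keys))

# A table with a duplicate row (assumed sample data, no null markers).
rows = [
    {"emp": "Ann", "dept": "Sales"},
    {"emp": "Ann", "dept": "Sales"},
]

fd_holds = satisfies_fd(rows, ["emp"], ["emp", "dept"])
uc_holds = satisfies_uc(rows, ["emp"])
print(fd_holds, uc_holds)
```

Here the functional dependency from `emp` to all columns holds while the uniqueness constraint on `emp` fails, so the functional dependency alone no longer certifies a key once duplicates are allowed.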


Author(s):  
Ranko Vujosevic ◽  
Andrew Kusiak

The database requirements for concurrent design systems are discussed. An object-oriented database is defined that allows for the definition of complex objects, the specification of relationships between objects, and modular expandability without affecting the existing information. The database is developed based on the object-oriented data model implemented in Smalltalk-80. An assumption-based truth maintenance system for maintaining the dependency relationships between design and manufacturing information is described.


2013 ◽  
Vol 753-755 ◽  
pp. 3112-3115
Author(s):  
Jing Li Huang ◽  
Qing Wang ◽  
Qiu Ling Lang

The three-dimensional engineering geology data warehouse is constructed with PowerDesigner 16.1, taking the availability of rock and soil mass in urban underground space as its theme and borehole data from engineering investigations as its source data. Using the Model-Driven Architecture method, the Access database is reverse engineered, the existing data model is extracted, and it is combined with the research theme to construct the star data structure model. The SQL script is then checked in SQL Server 2005 to ensure normal operation.
0 Foreword
The traditional transaction-oriented engineering geology database can store the original data from work, draw geological sections, and provide simple queries and analysis, but it lacks a decision support function oriented to a subject. The purpose of constructing a 3D engineering geological data warehouse is to build a decision support system for the availability of rock and soil mass in urban underground space. Based on data extraction, data integration, data cleaning, and data transformation, the 3D engineering geological data warehouse can achieve integrated management of massive geological data and provide a reliable data source for the rock and soil mass utilization system in urban underground space. The 3D engineering geological database is subject-oriented, integrated, time-varying, and relatively stable, and it is a massive collection of engineering geological spatial data and attribute data. Following the design pattern of a traditional database, the construction of the 3D engineering geological data warehouse can be divided into three stages: conceptual design model, logical design model, and physical design model. However, the construction process of the 3D engineering geological data warehouse is iterative. Currently, many CASE tools help developers complete database design quickly, such as Rational Rose by Rational, ERwin and BPwin by CA, PowerDesigner by Sybase, Office Visio by Microsoft, and Oracle Designer by Oracle. The paper uses PowerDesigner 16.1 to build the logical data model (LDM) and physical data model (PDM).


2019 ◽  
Vol 8 (3) ◽  
pp. 7753-7758

The article presents an adaptable data model based on multidimensional space. The main difference between a multidimensional data representation and the table representation used in relational Database Management Systems (DBMSs) is that new elements can be added at any time to the sets defining the axes of the multidimensional space. This changes the data model. The tabular representation of the relational model does not allow the model itself to be changed while an automated system is in operation. Three levels of the multidimensional data presentation space are considered: the axes of the multidimensional space, the Cartesian product of the sets of axis values, and the values of the space points. The five axes of multidimensional space defined in the article (entities, attributes, identifiers, time, modifiers) are basic for the design of an adaptable automated system. It is shown that additional axes can be used for greater granularity of the stored data. The multidimensional space structure defined in the article for an adaptable data model is a flexible set for storing a relational domain model. Two types of operations in multidimensional information space are defined. Relations of the relational model are formed dynamically depending on the conditions imposed on the coordinates of the points. Thus, an adaptable data representation model based on multidimensional space can be used to create flexible dynamic automated information systems.
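
One way to picture the five-axis space is as a mapping from coordinates to values, with relations built on demand from conditions on those coordinates. The sketch below is an assumption-laden illustration: the dictionary layout, the integer time stamps, the `"current"` modifier value, and the functions `put` and `relation` are all hypothetical, and the article's actual operations may differ.

```python
from collections import defaultdict

# Each point: (entity, attribute, identifier, time, modifier) -> value.
space = {}

def put(entity, attribute, identifier, time, modifier, value):
    """Store a value at one point of the five-axis space."""
    space[(entity, attribute, identifier, time, modifier)] = value

def relation(entity, at_time, modifier="current"):
    """Form rows dynamically: latest value per (identifier, attribute) up to at_time."""
    latest = {}
    for (e, a, i, t, m), v in space.items():
        if e == entity and m == modifier and t <= at_time:
            if (i, a) not in latest or latest[(i, a)][0] < t:
                latest[(i, a)] = (t, v)
    rows = defaultdict(dict)
    for (i, a), (_, v) in latest.items():
        rows[i][a] = v
    return dict(rows)

put("person", "name", 1, 1, "current", "Ann")
put("person", "city", 1, 1, "current", "Oslo")
put("person", "city", 1, 5, "current", "Rome")
# A new attribute is just a new value on the attribute axis; no schema change is needed.
put("person", "nickname", 1, 6, "current", "Annie")

print(relation("person", 3))
print(relation("person", 6))
```

The two calls return different column sets for the same entity, which mirrors the claim that relations are formed dynamically rather than fixed by a table definition.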

