AN EFFICIENT TEST FOR THE VALIDITY OF UNBIASED HYBRID KNOWLEDGE FRAGMENTATION IN DISTRIBUTED DATABASES

Author(s):  
YANCHUN ZHANG ◽  
MARIA E. ORLOWSKA ◽  
ROBERT COLOMB

Knowledge bases contain specific and general knowledge. In relational database systems, specific knowledge is often represented as a set of relations. The conventional methodologies for centralized database design can be applied to develop a normalized, redundancy-free global schema. Distributed database design involves redundancy removal as well as the distribution design which allows replicated data segments. Thus, distribution design can be viewed as a process on a normalized global schema which produces a collection of fragments of relations from a global database. Clearly, not every fragment of data can be permitted as a relation. In this paper, we clarify and formally discuss three kinds of fragmentations of relational databases, and characterize their features as valid designs, and we introduce a hybrid knowledge fragmentation as the general case. For completeness of presentation, we first show an algorithm for the validity test of vertical fragmentations of normalized relations, and then extend it to the more general case of unbiased fragmentations.
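As an illustration only, a validity check for a vertical fragmentation can be sketched as follows. This is not the algorithm from the paper, which the abstract does not spell out. It assumes a simple sufficient condition: the fragments together cover every attribute of the relation, and each fragment retains the relation's key, so the original relation can be rebuilt by a natural join. The function name and parameters are hypothetical.

```python
from typing import Iterable, Set


def is_valid_vertical_fragmentation(
    attributes: Iterable[str],
    key: Iterable[str],
    fragments: Iterable[Iterable[str]],
) -> bool:
    """Check a simple sufficient condition for a lossless vertical fragmentation.

    Assumption (not taken from the paper): a fragmentation is accepted when
    the fragments cover exactly the relation's attributes and every fragment
    contains the key.
    """
    attrs: Set[str] = set(attributes)
    key_set: Set[str] = set(key)
    frags = [set(f) for f in fragments]

    if not key_set <= attrs:
        raise ValueError("key must be a subset of the relation's attributes")
    if not frags:
        return False

    # Completeness: the union of the fragments is exactly the attribute set.
    covers_all = set().union(*frags) == attrs
    # Reconstruction: each fragment keeps the key, so a natural join on the
    # key rebuilds the original tuples.
    keeps_key = all(key_set <= f for f in frags)
    return covers_all and keeps_key
```

For example, splitting a relation with attributes `id`, `name`, `salary` and key `id` into `{id, name}` and `{id, salary}` passes this check, while `{id, name}` and `{salary}` does not, because the second fragment loses the key.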

2020 ◽  
Vol 26 (11) ◽  
pp. 1382-1401
Author(s):  
Izabela Rojek ◽  
Dariusz Mikołajewski ◽  
Piotr Kotlarz ◽  
Alžbeta Sapietová

This article presents the evolution of databases from classical relational databases to distributed databases and data warehouses to fuzzy databases used in a production enterprise. This paper discusses characteristics of this kind of enterprise. The authors precisely define centralized and distributed databases, data warehouses and fuzzy databases. In the modern global world, many companies change their management strategy from the one based on a centralized database to an approach based on distributed database systems. Growing expectations regarding business intelligence encourage companies to deploy data warehouses. New solutions are sought as the demand for engineers' expertise continues to rise. The requested knowledge can be certain or uncertain. Certain knowledge does not pose any problems and is easy to obtain. However, uncertain knowledge requires new ways of obtaining it, including the use of fuzzy logic. This is where the fuzzy database approach originates. The above-mentioned strategies of a production enterprise are described herein as a case of special interest.


Author(s):  
Berkay Aydin ◽  
Vijay Akkineni ◽  
Rafal A Angryk

With the ever-growing volume of spatiotemporal data, the use of non-relational and distributed database systems for storing massive spatiotemporal datasets is inevitable. In this chapter, the important aspects of non-relational (NoSQL) databases for storing large-scale spatiotemporal trajectory data are investigated. Two main data storage schemata are proposed for storing trajectories, called the traditional and partitioned data models. Additionally, spatiotemporal and non-spatiotemporal indexing structures are designed for efficiently retrieving data under different usage scenarios. The results of the experiments exhibit the advantages of utilizing data models and indexing structures for various query types.


Author(s):  
MD. SHAZZAD HOSAIN ◽  
MUHAMMAD ABDUL HAKIM NEWTON

In this paper we present a multi-key index model that enables us to search for a record using more than one attribute value in distributed database systems. Indices provide fast and efficient access to data and have therefore become a major aspect of centralized database systems. Most centralized database systems use a B+ tree or other types of index structures such as bit vectors, graph structures, grid files, etc. But for distributed database systems no index model is found in the literature. Therefore efficient access is a major problem in distributed databases. Our proposed index model avoids the query-flooding problem of existing systems and thus optimizes network bandwidth.
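The idea of looking up a record by several attribute values can be sketched with one inverted index per attribute and an intersection of the matching record identifiers. This is a minimal single-node sketch and an assumption on our part; the abstract does not describe the authors' distributed index model, and the class and method names below are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Set


class MultiKeyIndex:
    """Toy multi-attribute index: one value -> record-id map per attribute."""

    def __init__(self) -> None:
        # attribute name -> attribute value -> set of record ids
        self._index: Dict[str, Dict[Any, Set[int]]] = defaultdict(
            lambda: defaultdict(set)
        )

    def insert(self, record_id: int, record: Dict[str, Any]) -> None:
        """Register every attribute value of the record in the index."""
        for attribute, value in record.items():
            self._index[attribute][value].add(record_id)

    def search(self, **criteria: Any) -> Set[int]:
        """Return ids of records matching all given attribute values."""
        result = None
        for attribute, value in criteria.items():
            matches = self._index.get(attribute, {}).get(value, set())
            result = set(matches) if result is None else result & matches
            if not result:
                return set()  # no record can satisfy the remaining criteria
        return result if result is not None else set()


index = MultiKeyIndex()
index.insert(1, {"city": "Dhaka", "dept": "sales"})
index.insert(2, {"city": "Dhaka", "dept": "hr"})
index.insert(3, {"city": "Khulna", "dept": "sales"})
```

With these three records, `index.search(city="Dhaka", dept="sales")` intersects the two per-attribute result sets and returns only record 1, without scanning all records.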


2002 ◽  
Vol 40 (1) ◽  
pp. 55-64
Author(s):  
Saran Akram Abd Al-Majeed

There has been a great deal of discussion about null values in relational databases. The relational model was defined in 1969, and nulls were added in 1979. Unfortunately, there is no generally agreed solution to the null values problem. A null is a special marker that stands for an undefined or unknown value, meaning that no entry has been made. A missing-value mark is not a value, has no data type, and cannot be treated as a value by a Database Management System (DBMS). In a distributed database, users work with more than a single database and data are distributed among several data sources or sites. The data must be precise and replication is allowed, so complex problems arise and practical general approaches to the treatment of nulls are needed. A distributed database system, a "Hotel reservation control system", is designed on different data sources at four sites, each site representing a hotel, with different application programming languages used for greater heterogeneity. Five practical approaches are designed, with their rules and algorithms, for the treatment of null values across the distributed database sites. (1), (2), (3), (4), (5), (9).
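To make concrete why a null cannot be treated as an ordinary value, the sketch below models the three-valued logic that SQL applies to comparisons involving nulls, using Python's `None` as the unknown marker. It is background illustration only and does not reproduce any of the five approaches mentioned in the abstract; the function names are hypothetical.

```python
from typing import Any, Optional

# None plays the role of the "unknown" truth value as well as the null marker.
Truth = Optional[bool]


def eq3(a: Any, b: Any) -> Truth:
    """Equality under three-valued logic: any comparison with null is unknown."""
    if a is None or b is None:
        return None
    return a == b


def and3(x: Truth, y: Truth) -> Truth:
    """Three-valued AND: False dominates, otherwise unknown propagates."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def or3(x: Truth, y: Truth) -> Truth:
    """Three-valued OR: True dominates, otherwise unknown propagates."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False


def not3(x: Truth) -> Truth:
    """Three-valued NOT: the negation of unknown is still unknown."""
    return None if x is None else not x
```

Note that `eq3(None, None)` yields unknown rather than true, which is why two sites that each hold a null for the same field cannot simply be assumed to agree.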


2017 ◽  
Vol 15 (2) ◽  
pp. 61-72
Author(s):  
D O ABORISADE ◽  
A S SODIYA ◽  
A A ODUMOSU ◽  
O Y ALOWOSILE ◽  
A A ADEDEJI

Distributed database systems have been very useful technologies in making a wide range of information available to users across the world. However, there are now growing security concerns arising from the use of distributed systems, particularly those attached to critical systems. More than ever before, data in distributed databases are susceptible to attacks, failures or accidents owing to the explosion of advanced knowledge in network and database technologies. The imperfection of the existing security mechanisms, coupled with heightened and growing concerns about intrusion, attack, compromise or even Byzantine failure, are also contributing factors. The importance of distributed databases that survive Byzantine failure to other emerging technologies is the motivation for this research. Furthermore, it has been observed that most of the existing works on distributed databases dwell only on maintaining data integrity and availability in the face of attack. Few address the availability or survivability of distributed databases in the face of internal factors such as internal sabotage or storage defects. In this paper, an architecture for entrenching the survivability of distributed databases against Byzantine failures is proposed. The proposed architecture is based on re-creating data on a failing database server according to a set threshold value. The proposed architecture is tested and found to be capable of improving the probability of survivability of the distributed database in which it is implemented from 99.2% to 99.6%.


Author(s):  
Changhong Jing ◽  
Wenjie Liu ◽  
Jintao Gao ◽  
Ouya Pei

Data processing can be roughly divided into two categories: online transaction processing (OLTP) and online analytical processing (OLAP). OLTP is the main application of traditional relational databases and covers basic daily transaction processing, such as bank transactions. OLAP is the main application of data warehouse systems; it supports more complex data analysis operations, focuses on decision support, and provides accessible and intuitive analysis results. As the amount of data processed by enterprises continues to increase, distributed databases have gradually replaced stand-alone databases and become the mainstream of applications. However, the business currently supported by distributed databases is mainly based on OLTP applications and lacks OLAP implementations. This paper proposes an HTAP implementation method for the distributed database CBase, which provides OLAP analysis for CBase and can easily handle the analysis of large amounts of data.


Author(s):  
Rebecca Nyasuguta Arika ◽  
W. Cheruiyot

Transaction commit protocols help in reaching an agreement among the participating nodes when a transaction has to be committed or aborted. To initiate an agreement, each participating node is asked to vote on the operations on its transactional fragment. The participating nodes can decide to either commit or abort an ongoing transaction. In case of a node failure, the active participants take essential steps, such as running the termination protocol, to preserve database correctness. This paper investigates current distributed database commit protocols such as 2PC and 3PC in order to pinpoint their shortcomings. For instance, 2PC suffers from blocking of participant sites in case of coordinator failure and from increased latency due to forced writes of logs. For its part, 3PC suffers more communication overhead due to the extra pre-commit phase. Based on these setbacks, an efficient protocol is suggested towards the end of this paper that is believed to address some of the challenges, such as blocking and extra message exchange between communicating nodes.
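The voting and decision phases of two-phase commit described above can be sketched as an in-memory simulation. This is a simplified illustration, not the protocol proposed in the paper: it omits logging, timeouts, the termination protocol and network messages, and the `Participant` class and its fields are hypothetical.

```python
from typing import List


class Participant:
    """A node holding one fragment of the transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit  # hypothetical flag standing in for local checks
        self.state = "active"

    def prepare(self) -> bool:
        """Phase 1: vote yes (True) or no (False) on the local operations."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, decision: str) -> None:
        """Phase 2: apply the coordinator's global decision."""
        self.state = "committed" if decision == "commit" else "aborted"


def two_phase_commit(participants: List[Participant]) -> str:
    """Run both phases and return the global decision, "commit" or "abort"."""
    # Phase 1 (voting): every participant is asked, even after a "no" vote.
    votes = [p.prepare() for p in participants]
    # The transaction commits only if all participants voted yes.
    decision = "commit" if votes and all(votes) else "abort"
    # Phase 2 (decision): broadcast the outcome to all participants.
    for p in participants:
        p.finish(decision)
    return decision
```

In this sketch a single "no" vote aborts the transaction at every node. The blocking problem mentioned above arises between the two phases: a participant in the `prepared` state cannot decide on its own if the coordinator fails before broadcasting the outcome.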


2014 ◽  
Vol 13 (9) ◽  
pp. 4859-4867
Author(s):  
Khaled Saleh Maabreh

Distributed database management systems manage a huge amount of data as well as a large and growing number of users through different types of queries. Therefore, efficient methods for accessing these data volumes are required to provide a high, acceptable level of system performance. Data in these systems vary in type, from text to images, audio and video, and must be available through an optimized level of replication. Distributed database systems have many parameters, such as the degree of data distribution, the operation mode, the number of sites and replication. These parameters play a major role in any performance evaluation study. This paper investigates the main parameters that may affect system performance, which may help with configuring the distributed database system to enhance overall system performance.

