Query Computation Time and Its Performance over Distributed Database Frameworks

Maintaining an appropriate data fragmentation methodology is essential for using resources well, so an accurate and efficient fragmentation methodology must be chosen to improve the distributed database system. This raises challenges in data reliability, stable storage space and cost, communication cost, and security. In a distributed database framework, query computation and data privacy play a vital role over partitioned distributed databases, whether vertical, horizontal, or hybrid. Privacy of information is regarded as an essential issue nowadays; hence we show an approach to privacy preservation between two parties that distribute their data horizontally or vertically. In this chapter, we present an approach in which hierarchical clustering is applied over a horizontally partitioned data set. We also explain the required algorithms, such as hierarchical clustering and algorithms for finding the closest cluster. Furthermore, the chapter explores the performance of query computation over partitioned databases, with an analysis of efficiency and privacy.
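
As a rough illustration of the clustering step mentioned in this abstract, the sketch below runs single-linkage agglomerative clustering over the union of two horizontal partitions (same attributes, different rows). It is not the chapter's algorithm: the function names, the sample points, and the choice of single linkage are assumptions, and the two partitions are pooled in the clear, so no privacy-preserving step is shown.

```python
from itertools import combinations
from math import dist


def closest_pair(clusters):
    """Return indices (i, j), i < j, of the two clusters that are closest
    under single linkage (smallest distance between any two members)."""
    def linkage(a, b):
        return min(dist(p, q) for p in a for q in b)

    return min(
        combinations(range(len(clusters)), 2),
        key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
    )


def agglomerate(points, k):
    """Merge the closest pair of clusters repeatedly until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = closest_pair(clusters)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Hypothetical horizontal partitions held by two parties.
site_a = [(0.0, 0.0), (0.0, 1.0)]
site_b = [(5.0, 5.0), (5.0, 6.0), (0.5, 0.5)]

clusters = agglomerate(site_a + site_b, k=2)
for cluster in clusters:
    print(sorted(cluster))
```

With this sample data the row `(0.5, 0.5)` held by the second party ends up in the same cluster as the first party's rows, which is the kind of cross-site grouping that clustering over horizontally partitioned data has to handle.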

MULTI-KEY INDEX FOR DISTRIBUTED DATABASE SYSTEM

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194005002075 ◽

2005 ◽

Vol 15 (02) ◽

pp. 433-438

Author(s):

MD. SHAZZAD HOSAIN ◽

MUHAMMAD ABDUL HAKIM NEWTON

Keyword(s):

Distributed Databases ◽

Database Systems ◽

Distributed Database ◽

Index Model ◽

Network Bandwidth ◽

Distributed Database Systems ◽

Efficient Access ◽

Distributed Database System ◽

Bit Vector ◽

Centralized Database

In this paper we present a multi-key index model that enables us to search for a record by more than one attribute value in distributed database systems. Indices provide fast and efficient access to data and so have become a major aspect of centralized database systems. Most centralized database systems use the B+ tree or other types of index structure such as bit vectors, graph structures, or grid files. For distributed database systems, however, no index model is found in the literature, so efficient access is a major problem in distributed databases. Our proposed index model avoids the query-flooding problem of existing systems and thus optimizes network bandwidth.
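
To make the query-flooding idea concrete, here is a minimal sketch of a directory that maps attribute values to the sites holding matching records, so a multi-attribute search is routed only to candidate sites. This is not the paper's index model: the `SiteDirectory` class, its methods, and the site and attribute names are all hypothetical, and the directory can return false positives because it tracks sites, not individual records.

```python
from collections import defaultdict


class SiteDirectory:
    """Hypothetical directory: (attribute, value) -> sites that hold such a value."""

    def __init__(self):
        self._sites = defaultdict(set)
        self._all_sites = set()

    def register(self, site, record):
        """Note that `site` stores a record with these attribute values."""
        self._all_sites.add(site)
        for attribute, value in record.items():
            self._sites[(attribute, value)].add(site)

    def candidate_sites(self, **criteria):
        """Intersect the site sets of every requested attribute value."""
        candidates = set(self._all_sites)
        for attribute, value in criteria.items():
            candidates &= self._sites.get((attribute, value), set())
        return candidates


directory = SiteDirectory()
directory.register("site1", {"dept": "cs", "year": 2005})
directory.register("site2", {"dept": "cs", "year": 2004})
directory.register("site3", {"dept": "ee", "year": 2005})

# Only site1 needs to receive the two-attribute query.
print(sorted(directory.candidate_sites(dept="cs", year=2005)))  # ['site1']
```

Without such a directory the same query would have to be broadcast to all three sites, which is the flooding behaviour the abstract refers to.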

PcAVP - Adaptable Vertical Partitioning Algorithm Based on Privacy Constraint for Distributed Database System

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.303-306.2139 ◽

2013 ◽

Vol 303-306 ◽

pp. 2139-2143

Author(s):

Jie Jiang ◽

Zhu Yan Gu ◽

Tie Ming Chen

Keyword(s):

Privacy Protection ◽

Data Privacy ◽

Distributed Database ◽

Constraint Checking ◽

Vertical Partitioning ◽

Data Privacy Protection ◽

Partitioning Algorithm ◽

Distributed Database System ◽

Specific Number ◽

Best Fit

Vertical partitioning is a process of generating fragments, each of which is composed of attributes with high affinity. It is widely used in distributed databases to improve system efficiency by reducing the connections between table access operations. Current research on vertical partitioning focuses mainly on how to measure "affinity" to obtain the best-fit vertical partitioning, and on n-way vertical partitioning, which generates the specific number of fragments required by the user. In this paper, we propose a vertical partitioning algorithm based on privacy constraints. It supports both best-fit and n-way vertical partitioning, and it also protects data privacy through privacy constraint checking. Our experimental results show that the algorithm not only maintains high efficiency but also provides data privacy protection.
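
One common way to quantify attribute "affinity" is to sum the access frequencies of the queries that use two attributes together. The sketch below computes such a matrix and adds a simple constraint check; it is an illustration under assumptions, not the PcAVP algorithm. In particular, modelling a privacy constraint as a set of attributes that must not share a fragment is an assumption, and the attribute names and frequencies are made up.

```python
from itertools import combinations


def affinity_matrix(attributes, queries):
    """aff[a][b] = total frequency of the queries that access both a and b."""
    aff = {a: {b: 0 for b in attributes} for a in attributes}
    for used, frequency in queries:
        for a in used:
            aff[a][a] += frequency
        for a, b in combinations(sorted(used), 2):
            aff[a][b] += frequency
            aff[b][a] += frequency
    return aff


def violates_privacy(fragment, constraints):
    """True if the fragment contains every attribute of some forbidden set."""
    return any(forbidden <= set(fragment) for forbidden in constraints)


attributes = ["name", "salary", "dept", "diagnosis"]
queries = [  # (attributes used, access frequency); hypothetical workload
    ({"name", "dept"}, 30),
    ({"name", "salary"}, 10),
    ({"salary", "dept"}, 5),
    ({"name", "diagnosis"}, 20),
]
constraints = [{"name", "diagnosis"}]  # assumed: must not share a fragment

aff = affinity_matrix(attributes, queries)
print(aff["name"]["dept"])                                # 30
print(violates_privacy(["name", "dept"], constraints))    # False
print(violates_privacy(["name", "diagnosis"], constraints))  # True
```

A partitioning procedure would then group high-affinity attributes while rejecting any candidate fragment for which `violates_privacy` returns `True`.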

Distributed database system of the New Atlas of Amphibians and Reptiles in Europe: the NA2RE project

Amphibia-Reptilia ◽

10.1163/15685381-00002936 ◽

2014 ◽

Vol 35 (1) ◽

pp. 33-39 ◽

Cited By ~ 10

Author(s):

Neftalí Sillero ◽

Marco Amaro Oliveira ◽

Pedro Sousa ◽

Fátima Sousa ◽

Luís Gonçalves-Seco

Keyword(s):

Spatial Data ◽

Distributed Databases ◽

Species Conservation ◽

Distributed Database ◽

Database System ◽

Data Systems ◽

Web Interface ◽

Data Infrastructure ◽

Societas Europaea ◽

Distributed Database System

The Societas Europaea Herpetologica (SEH) decided in 2006 through its Mapping Committee to implement the New Atlas of Amphibians and Reptiles of Europe (NA2RE: http://na2re.ismai.pt) as a chorological database system. Initially designed to be a system of distributed databases, NA2RE quickly evolved into a Spatial Data Infrastructure, a system of geographically distributed systems. Each individual system has a national focus and is implemented in an online network, accessible through standard interfaces, thus allowing for interoperable communication and sharing of spatial-temporal data amongst one another. A Web interface gives the user access to all participating data systems as if they were one single virtual integrated data source. Upon user request, the Web interface searches all distributed data sources for the requested data, integrating the answers in an always updated and interactive map. This infrastructure implements methods for fast updating of national observation records, as well as for the use of a common taxonomy and systematics. Using this approach, data duplication is avoided, national systems are maintained in their own countries, and national organisations are responsible for their own data curation and management. The database could be built with different representation levels and resolution levels of data, and filtered according to species conservation matters. We present the first prototype of NA2RE, composed of the last data compilation performed by the SEH (Sillero et al., 2014). This system is implemented using only open source software: PostgreSQL database with PostGIS extension, Geoserver, and OpenLayers.

An Enhanced Methodology on Internet of Things with Cloud in Smart Electrical Systems

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c3842.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 2295-2299

Keyword(s):

Internet Of Things ◽

Performance Management ◽

Data Privacy ◽

Homomorphic Encryption ◽

Human Life ◽

Vital Role ◽

Electrical System ◽

Privacy And Security ◽

Security Issues ◽

Electrical Systems

Smart management systems play a vital role in many domains and improve the reliability of the protection and privacy of a system. Electrical systems have become a part of everyday human life. Next-generation electrical systems will depend entirely on fully automated, smart control systems. This paper explores various mechanisms of cloud gateways and security issues for the smart management of an electrical system. The survey examines the Internet of Things (IoT) in association with the cloud. Cloud-based IoT in a smart electrical system offers potential improvements in the performance, management, and resilience of the smart system. However, adopting a cloud-based IoT system to store and retrieve data from the cloud may increase risks to data privacy and security. Various end-to-end security schemes are discussed to overcome the flaws in the global integration of cloud and IoT over the internet. As a result, in many applications a simple IoT cloud gateway is set up together with a homomorphic encryption technique to address communication overheads and security issues.

Implementasi Sistem Interkoneksi Basis Data Terdistribusi Menggunakan Socket API

Jurnal Komtika ◽

10.31603/komtika.v1i1.1680 ◽

2017 ◽

Vol 1 (1) ◽

pp. 1-4

Author(s):

Ary Yulianto

Keyword(s):

Information System ◽

Big Data ◽

Human Resources ◽

Distributed Databases ◽

Distributed Database ◽

Database System ◽

Master Data ◽

Distributed Database System ◽

The Moment ◽

Asset Data

Majelis Pendidikan Dasar dan Menengah Pimpinan Daerah Kabupaten Magelang needs information about the development of its schools. The necessary data include human resources data, asset data, and student data. At the moment the data is obtained manually by visiting the schools. With the development of information technology, a school information system is now expected to provide information quickly and accurately. If the data of each school is put into one large server, or big data store, it can be used for analysis. The purpose of this research is to implement the interconnection of a distributed database system using the socket API, in order to provide master data for all schools under Persyarikatan Muhammadiyah. Socket API technology is used to support the analysis of the data through the interconnection of distributed databases. With such a system the databases can communicate, so that each school's data is sent directly through the socket API, processed automatically by the web service in the server middleware, and made available to the next process.

Implementation of Clustering Algorithms for Real Time Large Datasets

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c2570.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 2303-2304

Keyword(s):

Big Data ◽

Clustering Algorithms ◽

Vital Role ◽

Large Datasets ◽

Similar Data ◽

Data Set ◽

Survey Paper ◽

Density Based Clustering ◽

Geographical Maps ◽

Data Objects

Nowadays clustering plays a vital role in big data. It is very difficult to analyze and cluster large volumes of data. Clustering is a procedure for grouping similar data objects of a data set. The aim is high intra-cluster similarity inside each cluster and low inter-cluster similarity between clusters. Clustering is used in statistical analysis, geographical maps, biological cell analysis, and Google Maps. The various approaches to clustering include grid clustering, density-based clustering, hierarchical methods, and partitioning approaches. In this survey paper we focus on all these algorithms for large datasets such as big data and report a comparison among them. The main metric used to differentiate the algorithms is time complexity.

Null Values Treatment in Distributed Databases

Iraqi Journal for Computers and Informatics ◽

10.25195/ijci.v40i1.226 ◽

2002 ◽

Vol 40 (1) ◽

pp. 55-64

Author(s):

Saran Akram Abd Al-Majeed

Keyword(s):

Programming Languages ◽

Relational Databases ◽

Distributed Databases ◽

Distributed Database ◽

Database Management System ◽

Data Sources ◽

Relational Model ◽

A Value ◽

Null Values ◽

Distributed Database System

There has been a great deal of discussion about null values in relational databases. The relational model was defined in 1969, and nulls were added in 1979. Unfortunately, there is no generally agreed solution to the null values problem. Null is a special marker that stands for an undefined or unknown value, meaning that no entry has been made; a missing-value mark is not a value, is not of a data type, and cannot be treated as a value by the Database Management System (DBMS). In a distributed database there are more users than in a single database, and data is distributed among several data sources or sites; the data must be precise, and replication is allowed, so complex problems appear and practical general approaches for the treatment of nulls are needed. A distributed database system, a "Hotel reservation control system", is designed on different data sources at four sites, each site representing a hotel and, for greater heterogeneity, using different application programming languages. Five practical approaches are designed, with their rules and algorithms, for null values treatment across the distributed database sites (1), (2), (3), (4), (5), (9).

Rough ISODATA Algorithm

International Journal of Fuzzy System Applications ◽

10.4018/ijfsa.2013100101 ◽

2013 ◽

Vol 3 (4) ◽

pp. 1-14 ◽

Cited By ~ 2

Author(s):

S. Sampath ◽

B. Ramya

Keyword(s):

Clustering Algorithm ◽

Clustering Algorithms ◽

Real Life ◽

Vital Role ◽

Data Sets ◽

Clustering Method ◽

Data Set ◽

Number Of Clusters ◽

Real Life Data ◽

Nonparametric Statistical

Cluster analysis is a branch of data mining that plays a vital role in bringing out hidden information in databases. Clustering algorithms help medical researchers identify the presence of natural subgroups in a data set. Different types of clustering algorithms are available in the literature, the most popular being k-means clustering. Although k-means is a widely used clustering method, applying it requires knowing the number of clusters present in the given data set. Several solutions are available in the literature to overcome this limitation. The k-means clustering method creates a disjoint and exhaustive partition of the data set. However, in some situations one can come across objects that belong to more than one cluster. In this paper, a clustering algorithm is proposed that produces rough clusters automatically, without requiring the user to supply the number of clusters to be produced. The efficiency of the algorithm in detecting the number of clusters present in the data set has been studied with the help of some real-life data sets. Further, a nonparametric statistical analysis of the experimental results has been carried out in order to analyze the efficiency of the proposed algorithm in automatically detecting the number of clusters in the data set with the help of a rough version of the Davies-Bouldin index.
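
For orientation, the sketch below computes the classical (crisp) Davies-Bouldin index, where lower values indicate compact, well-separated clusters. It is not the rough version used in the paper, and the function names and sample points are assumptions chosen so the result can be checked by hand.

```python
from math import dist


def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def davies_bouldin(clusters):
    """Classical Davies-Bouldin index: the mean, over clusters, of the worst
    ratio (scatter_i + scatter_j) / distance(centroid_i, centroid_j)."""
    centers = [centroid(c) for c in clusters]
    scatter = [
        sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# The same four points grouped two different ways.
tight = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

print(davies_bouldin(tight))  # 0.2
print(davies_bouldin(loose))  # 5.0
```

Comparing the index across candidate partitions with different numbers of clusters is one way such a measure can guide the choice of cluster count.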

Nonreplicated Static Data Allocation in Distributed Databases Using Biogeography-Based Optimization

Chinese Journal of Engineering ◽

10.1155/2014/785321 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 4

Author(s):

Arjan Singh ◽

Karanjeet Singh Kahlon ◽

Rajinder Singh Virk

Keyword(s):

Data Transfer ◽

Distributed Databases ◽

Distributed Database ◽

Database System ◽

Data Allocation ◽

Transmission Cost ◽

Static Data ◽

Total Data ◽

Transfer Cost ◽

Distributed Database System

Allocation of data is one of the key design issues of a distributed database. A major cost of query execution in a distributed database system is the cost of transferring data from one site to another. The allocation of fragments among the different sites of the network plays an important role in the performance of the distributed database system. The main objective of data allocation in a distributed database is to place the data fragments at different sites in such a way that the total data transfer cost is minimized while executing a set of queries. In this paper, a new biogeography-based optimization (BBO) algorithm is used to allocate the fragments during the design of a distributed database system. The goal of this paper is to design a fragment allocation algorithm that minimizes the total data transmission cost. To show the performance of the proposed algorithm, the results of the biogeography-based optimization algorithm for data allocation are compared with those of a genetic algorithm.
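
The objective described in this abstract can be illustrated with a toy cost model: each query is issued at a site with some frequency, and reading a fragment costs its size times the link cost to the site that stores it. The sketch below searches every non-replicated allocation exhaustively. It is neither BBO nor a genetic algorithm, it ignores site capacity, and the cost model, names, and numbers are all assumptions.

```python
from itertools import product


def transfer_cost(allocation, queries, fragment_size, link_cost):
    """Total cost = sum of frequency * fragment size * link cost, taken over
    every fragment that each query reads from the fragment's host site."""
    total = 0
    for origin, fragments, frequency in queries:
        for fragment in fragments:
            host = allocation[fragment]
            total += frequency * fragment_size[fragment] * link_cost[origin][host]
    return total


def best_allocation(fragments, sites, queries, fragment_size, link_cost):
    """Try every assignment of one site per fragment (no replication)."""
    best = None
    for hosts in product(sites, repeat=len(fragments)):
        allocation = dict(zip(fragments, hosts))
        cost = transfer_cost(allocation, queries, fragment_size, link_cost)
        if best is None or cost < best[1]:
            best = (allocation, cost)
    return best


sites = ["s1", "s2"]
fragments = ["f1", "f2"]
fragment_size = {"f1": 10, "f2": 20}
link_cost = {"s1": {"s1": 0, "s2": 4}, "s2": {"s1": 4, "s2": 0}}
queries = [  # (issuing site, fragments read, frequency); hypothetical
    ("s1", ["f1"], 5),
    ("s2", ["f1", "f2"], 2),
    ("s2", ["f2"], 3),
]

allocation, cost = best_allocation(fragments, sites, queries, fragment_size, link_cost)
print(allocation, cost)  # {'f1': 's1', 'f2': 's2'} 80
```

Exhaustive search grows as the number of sites raised to the number of fragments, which is why heuristic methods such as BBO or genetic algorithms are applied to realistic problem sizes.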

Secure Data Analysis in Clusters (Iris Database)

Advances in Business Information Systems and Analytics - Handbook of Research on Advanced Data Mining Techniques and Applications for Business Intelligence ◽

10.4018/978-1-5225-2031-3.ch004 ◽

2017 ◽

pp. 52-61

Author(s):

Raghvendra Kumar ◽

Prasant Kumar Pattnaik ◽

Priyanka Pandey

Keyword(s):

Data Mining ◽

Data Analysis ◽

Privacy Preservation ◽

Data Communication ◽

Distributed Database ◽

Privacy Preserving ◽

Data Mining Algorithm ◽

Data Set ◽

Mining Algorithm ◽

Secure Data

This chapter uses privacy preservation techniques (data modification) to ensure privacy. Privacy preservation is another important issue. Consider a scenario in which a number of clients, each owning a clustered database (the Iris database), wish to run a data mining algorithm on the union of their databases without revealing any unnecessary information, while preserving the privacy of privileged information. A number of efficient protocols are required for privacy preservation in data mining. This chapter presents various privacy-preserving protocols used for security in clustered databases. The Xln(X) protocol and the secure sum protocol are used in mutual computing and can defend privacy efficiently. The chapter focuses on data modification techniques: the distributed database is modified, and the modified data set is then sent to the client admin for secure data communication with zero percent data leakage and reduced communication and computation complexity.
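
The secure sum protocol named in this abstract is usually described as a ring: the first party adds a random mask to its value, each following party adds its own value to the running total, and the first party removes the mask at the end. The single-process simulation below is a sketch of that textbook form, not the chapter's implementation. It assumes honest-but-curious parties that do not collude and a modulus larger than the true sum; the function name and sample values are hypothetical.

```python
import secrets


def secure_sum(private_values, modulus):
    """Simulate ring-based secure sum.

    Returns the total and the list of messages passed between parties, so the
    caller can see that no message equals a party's raw value (except by chance).
    """
    mask = secrets.randbelow(modulus)  # known only to the first party
    running = (mask + private_values[0]) % modulus
    messages = [running]
    for value in private_values[1:]:
        running = (running + value) % modulus
        messages.append(running)
    total = (running - mask) % modulus  # first party removes its mask
    return total, messages


values = [12, 7, 30]  # one private value per party
total, messages = secure_sum(values, modulus=1_000)
print(total)  # 49
```

Each intermediate message is the true partial sum shifted by the random mask, so a party that sees only its incoming message learns nothing about the values added before it, under the no-collusion assumption stated above.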
