Distributed Query Plan Generation using Ant Colony Optimization

2015 ◽  
Vol 6 (1) ◽  
pp. 1-22 ◽  
Author(s):  
T.V. Vijay Kumar ◽  
Rahul Singh ◽  
Amit Kumar

Query processing is a critical performance parameter and has received considerable attention, especially in the context of distributed database systems. The aim of distributed query processing is to process a query effectively and efficiently. This entails laying down a distributed query processing strategy that generates efficient query plans. Since, in distributed database systems, data is distributed and replicated at multiple sites, the number of query plans increases exponentially with the number of relations accessed by the query and with the number of sites containing these relations. Thus, from amongst these query plans, there is a need to generate optimal query plans involving fewer sites, which, in turn, would entail lower site-to-site communication cost and hence faster query response times. In this paper, an attempt has been made to generate such query plans for a distributed query using Ant Colony Optimization (ACO). This ACO based distributed query plan generation (DQPG) algorithm, when compared with the genetic algorithm (GA) based DQPG algorithm, is able to generate comparatively better quality Top-K query plans for a given distributed query.
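The growth of the search space described above can be illustrated with a minimal sketch. It is not the ACO algorithm from the paper: it enumerates every plan by brute force, and it scores a plan simply by the number of distinct sites it touches, which is an assumed stand-in for the paper's cost model. The replication map, site names, and function names are all hypothetical.

```python
from heapq import nsmallest
from itertools import product

# Hypothetical replication map: relation -> sites holding a copy of it.
replicas = {
    "R1": ["S1", "S2"],
    "R2": ["S1", "S3"],
    "R3": ["S2", "S3"],
}

def enumerate_plans(replicas):
    """Yield every query plan, i.e. one chosen site per relation."""
    relations = sorted(replicas)
    for sites in product(*(replicas[r] for r in relations)):
        yield dict(zip(relations, sites))

def sites_used(plan):
    """Assumed cost proxy: number of distinct sites the plan involves."""
    return len(set(plan.values()))

def top_k_plans(replicas, k):
    """Return the k plans with the fewest distinct sites (exhaustive search)."""
    return nsmallest(k, enumerate_plans(replicas), key=sites_used)

plans = list(enumerate_plans(replicas))
best = top_k_plans(replicas, 3)
print(len(plans), "plans in total")
for plan in best:
    print(sites_used(plan), plan)
```

The number of plans is the product of the replica counts per relation, so exhaustive enumeration of this kind stops being practical as relations and sites are added; that is the motivation for metaheuristics such as ACO or GA.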

2017 ◽  
Vol 8 (1) ◽  
pp. 1-26 ◽  
Author(s):  
Jay Prakash ◽  
Neha Singh ◽  
T.V. Vijay Kumar

In distributed database systems, relations are replicated and fragmented at multiple sites to ensure easy availability and greater reliability. This leads to an exponential increase in the possible alternatives available for selecting the set of sites, constituting a query plan, for processing. Computing the optimal query plans, from amongst all possible query plans, is a discrete combinatorial optimization problem. This Distributed Query Plan Generation (DQPG) problem has been addressed using Bacterial Foraging Optimization (BFO) in this paper. Here, a novel BFO based DQPG algorithm (DQPGBFO), which generates the Top-K distributed query plans having the minimum total query processing cost, has been proposed. Experimental comparison of DQPGBFO with the existing Genetic Algorithm (GA) based DQPG algorithm (DQPGGA) shows that the former is able to generate Top-K query plans that have a comparatively lower total cost of processing a distributed query. This, in turn, leads to a reduction in the query response time and thus aids in decision making.


2017 ◽  
Vol 6 (1) ◽  
pp. 86-100
Author(s):  
Monika Yadav ◽  
T. V. Vijay Kumar

Query processing in distributed databases involves data transmission amongst sites capable of providing answers to a distributed query. For this, a distributed query processing strategy, which generates efficient query processing plans for a given distributed query, needs to be devised. Since, in distributed databases, the data is fragmented and replicated at multiple sites, the number of query plans increases exponentially with the number of sites capable of providing answers to a distributed query. As a result, generating efficient query processing plans, from amongst all possible query plans, becomes a complex problem. This distributed query plan generation (DQPG) problem has been addressed using the Cuckoo Search Algorithm (CSA) in this paper. Accordingly, a CSA based DQPG algorithm (DQPGCSA) that aims to generate Top-K query plans having minimum cost of processing a distributed query has been proposed. Experimental comparison of DQPGCSA with the existing genetic algorithm (GA) based DQPG algorithm shows that the former is able to generate Top-K query plans that have a comparatively lower query processing cost. This, in turn, reduces the query response time, resulting in efficient decision making.


2013 ◽  
Vol 4 (3) ◽  
pp. 58-82 ◽  
Author(s):  
T.V. Vijay Kumar ◽  
Amit Kumar ◽  
Rahul Singh

A large number of queries are posed on databases spread across the globe. In order to process these queries efficiently, optimal query processing strategies that generate efficient query processing plans are being devised. In distributed relational database systems, due to replication of relations at multiple sites, answering a query may require accessing data from multiple sites. This leads to an exponential increase in the number of possible alternative query plans for processing a query. Though it is not computationally feasible to explore all possible query plans in such a large search space, the query plan that provides the most cost-effective option for query processing should be generated for a given query. In this paper, an attempt has been made to generate such optimal query plans using Set based Comprehensive Learning Particle Swarm Optimization (S-CLPSO). Experimental comparisons of this algorithm with the genetic algorithm (GA) based distributed query plan generation algorithm show that, for a higher number of relations, the S-CLPSO based algorithm is able to generate comparatively better quality Top-K query plans.


Author(s):  
Cyrus Shahabi ◽  
Farnoush Banaei-Kashani

Recently, a family of massive self-organizing data networks has emerged. These networks mainly serve as large-scale distributed query-processing systems. We term these networks querical data networks (QDN). A QDN is a federation of a dynamic set of peer, autonomous nodes communicating through a transient-form interconnection. Data is naturally distributed among the QDN nodes in extra-fine grain, where a few data items are dynamically created, collected, and/or stored at each node. Therefore, the network scales linearly to the size of the data set. With a dynamic data set, a dynamic and large set of nodes, and a transient-form communication infrastructure, QDNs should be considered as the new generation of distributed database systems with significantly less constraining assumptions as compared to their ancestors. Peer-to-peer networks (Daswani, Garcia-Molina, & Yang, 2003) and sensor networks (Akyildiz, Su, Sankarasubramaniam, & Cayirci, 2002; Estrin, Govindan, Heidemann, & Kumar, 1999) are well-known examples of QDNs.


Author(s):  
Cyrus Shahabi ◽  
Farnoush Banaei-Kashani

Recently, a family of massive self-organizing data networks has emerged. These networks mainly serve as large-scale distributed query processing systems. We term these networks Querical Data Networks (QDN). A QDN is a federation of a dynamic set of peer, autonomous nodes communicating through a transient-form interconnection. Data is naturally distributed among the QDN nodes in extra-fine grain, where a few data items are dynamically created, collected, and/or stored at each node. Therefore, the network scales linearly to the size of the dataset. With a dynamic dataset, a dynamic and large set of nodes, and a transient-form communication infrastructure, QDNs should be considered as the new generation of distributed database systems with significantly less constraining assumptions as compared to their ancestors. Peer-to-peer networks (Daswani, 2003) and sensor networks (Estrin, 1999; Akyildiz, 2002) are well-known examples of QDNs.

