query plan Latest Research Papers

Execution Time Prediction for Cypher Queries in the Neo4j Database Using a Learning Approach

As database management systems grow more complex, predicting the execution time of graph queries before they run has become a key challenge for query scheduling, workload management, resource allocation, and progress monitoring. Existing research on query performance prediction has addressed this problem for traditional SQL queries, but those methods cannot be applied directly to Cypher queries on the Neo4j database. Additionally, most query performance prediction methods focus on measuring the relationship between correlation coefficients and retrieval performance. Inspired by machine-learning methods and graph query optimization technologies, we used an RBF neural network as the prediction model for the execution time of Cypher queries. The corresponding query pattern features, graph data features, and query plan features were fused together and used to train our prediction models. We also deployed a monitor node and designed a Cypher query benchmark for the database clusters to obtain the query plan information and native data store. The experimental results of four benchmarks showed that the average mean relative error of the RBF model reached 16.5% in the Northwind dataset, 12% in the FIFA2021 dataset, and 16.25% in the CORD-19 dataset. This experiment demonstrates the effectiveness of our proposed approach on three real-world datasets.
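The abstract reports accuracy as a mean relative error but does not spell out the formula. The sketch below assumes the common definition, the average of |predicted - actual| / actual over all queries. The function name and the sample timings are hypothetical and are not taken from the paper.

```python
def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / actual over all queries.

    Assumes every actual execution time is strictly positive.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) / a for a, p in zip(actual, predicted))
    return total / len(actual)


# Hypothetical execution times in milliseconds (illustration only).
actual_ms = [100.0, 200.0, 400.0]
predicted_ms = [110.0, 180.0, 400.0]

mre = mean_relative_error(actual_ms, predicted_ms)
print(f"mean relative error: {mre:.2%}")
```

Under this definition, a reported value such as 12% would mean that predictions deviate from the measured execution time by 12% on average.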

PerfGuard

Proceedings of the VLDB Endowment ◽

10.14778/3484224.3484233 ◽

2021 ◽

Vol 14 (13) ◽

pp. 3362-3375

Author(s):

Remmelt Ammerlaan ◽

Gilbert Antonius ◽

Marc Friedman ◽

H M Sajjad Hossain ◽

Alekh Jindal ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Processing ◽

Relational Databases ◽

Entire Space ◽

Query Plan ◽

Future Improvement ◽

Massive Scale ◽

End To End ◽

Intractable Problem

Modern data processing systems require optimization at massive scale, and using machine learning to optimize these systems (ML-for-systems) has shown promising results. Unfortunately, ML-for-systems is subject to overgeneralizations that do not capture the large variety of workload patterns, and it tends to improve the performance of certain subsets of the workload while regressing performance for others. In this paper, we introduce a performance safeguard system, called PerfGuard, that designs pre-production experiments for deploying ML-for-systems. Instead of searching the entire space of query plans (a well-known, intractable problem), we focus on query plan deltas (a significantly smaller space). PerfGuard formalizes these differences, and correlates plan deltas to important feedback signals, like execution cost. We describe the deep learning architecture and the end-to-end pipeline in PerfGuard that could be used with general relational databases. We show that this architecture improves on baseline models, and that our pipeline identifies key query plan components as major contributors to plan disparity. Offline experimentation shows PerfGuard as a promising approach, with many opportunities for future improvement.

Modified Firefly Algorithm for Optimizing Biomedical Breast Cancer Queries

10.21203/rs.3.rs-171211/v1 ◽

2021 ◽

Author(s):

Gomathi Ramalingam

Keyword(s):

Breast Cancer ◽

Semantic Web ◽

Execution Time ◽

Firefly Algorithm ◽

Optimal Solution ◽

Query Languages ◽

Query Execution ◽

Web Data ◽

Query Plan ◽

Modified Firefly Algorithm

Querying and retrieving Semantic Web data is a challenging task because of its growing volume. Many query languages have been designed to retrieve Semantic Web data; a popular one is SPARQL. These query languages were designed with some optimization strategies, but the literature reports that they cannot handle large volumes of data efficiently. In this research, a Modified Firefly Algorithm (MFA) is applied to optimize SPARQL queries so that they can retrieve data from a large Semantic Web repository efficiently by reducing query execution time. Every query has multiple query plans with different cost values. The challenge is to choose the best query plan, the one that reduces the query cost and query execution time. The proposed algorithm uses the best query plan from the previous iteration to calculate the distance between two query plans using the radius parameter. The proposed algorithm generates a query plan which is a global optimal solution. MFA is evaluated using the BioPortal dataset with triples containing breast cancer. Experimental analysis is conducted to identify the improvement in performance of the proposed work over existing nature-inspired query optimization algorithms. The efficiency of MFA is compared with other algorithms in terms of query execution time.

An automatic clustering technique for query plan recommendation

Information Sciences ◽

10.1016/j.ins.2020.09.037 ◽

2021 ◽

Vol 545 ◽

pp. 620-632

Author(s):

Elham Azhir ◽

Nima Jafari Navimipour ◽

Mehdi Hosseinzadeh ◽

Arash Sharifi ◽

Aso Darwesh

Keyword(s):

Query Plan ◽

Clustering Technique ◽

Automatic Clustering

An empirical evaluation of cost-based federated SPARQL query processing engines

Semantic Web ◽

10.3233/sw-200420 ◽

2021 ◽

pp. 1-26

Author(s):

Umair Qudus ◽

Muhammad Saleem ◽

Axel-Cyrille Ngonga Ngomo ◽

Young-Koo Lee

Keyword(s):

Query Processing ◽

Detailed Analysis ◽

Performance Metrics ◽

Empirical Evaluation ◽

Sparql Query ◽

Evaluation Metrics ◽

Future Cost ◽

Query Plan ◽

Fine Grained ◽

Runtime Performance

Finding a good query plan is key to the optimization of query runtime. This holds in particular for cost-based federation engines, which make use of cardinality estimations to achieve this goal. A number of studies compare SPARQL federation engines across different performance metrics, including query runtime, result set completeness and correctness, number of sources selected and number of requests sent. Albeit informative, these metrics are generic and unable to quantify and evaluate the accuracy of the cardinality estimators of cost-based federation engines. To thoroughly evaluate cost-based federation engines, the effect of estimated cardinality errors on the overall query runtime performance must be measured. In this paper, we address this challenge by presenting novel evaluation metrics targeted at a fine-grained benchmarking of cost-based federated SPARQL query engines. We evaluate five cost-based federated SPARQL query engines using existing as well as novel evaluation metrics by using LargeRDFBench queries. Our results provide a detailed analysis of the experimental outcomes that reveal novel insights, useful for the development of future cost-based federated SPARQL query processing engines.
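The abstract does not define its novel metrics. As general background only, one widely used way to quantify how far an estimated cardinality is from the true one is the q-error, the larger of the two ratios between estimate and truth. The sketch below illustrates that measure; it is not claimed to be one of the metrics the paper proposes, and the sample values are hypothetical.

```python
def q_error(estimated, actual):
    """Multiplicative error factor between estimated and true cardinality.

    A value of 1.0 means a perfect estimate. Over- and underestimation by
    the same factor yield the same q-error. Both inputs must be positive.
    """
    if estimated <= 0 or actual <= 0:
        raise ValueError("cardinalities must be positive")
    return max(estimated / actual, actual / estimated)


# Hypothetical cardinalities (illustration only).
underestimate = q_error(estimated=50, actual=1000)
overestimate = q_error(estimated=1000, actual=50)
perfect = q_error(estimated=10, actual=10)
print(underestimate, overestimate, perfect)
```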

ESTIMATION OF ATTRIBUTE VALUES IN JOIN TABLES WHILE OPTIMIZING RELATIONAL DATABASE QUERY

Informatika i sistemy upravleniya ◽

10.22250/isu.2021.67.3-18 ◽

2021 ◽

pp. 3-18

Author(s):

Y.A. Grigorev ◽

O.Yu. Pluzhnikova

Keyword(s):

Relational Database ◽

Database Query ◽

Query Plan ◽

Cardinality Estimation ◽

The Cost

The article analyzes the problem of estimating the cardinality of joined tables when calculating the cost of a relational database query plan. A new algorithm for estimating the distinct values of attributes is proposed. The algorithm reduces inaccuracy in cardinality estimation. The consistency of the proposed algorithm is proved.
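To show why distinct-value estimates matter for join cardinality, the sketch below uses the textbook equi-join estimate |R| * |S| / max(V(R, a), V(S, a)), which assumes uniformly distributed values and containment of the smaller value set in the larger one. This is not the algorithm proposed in the article; the function name, table sizes, and distinct counts are hypothetical.

```python
def estimate_join_cardinality(rows_r, rows_s, distinct_r, distinct_s):
    """Textbook estimate for the size of an equi-join of R and S on attribute a.

    rows_r, rows_s: number of rows in each table.
    distinct_r, distinct_s: number of distinct values of a in each table.
    """
    if rows_r == 0 or rows_s == 0:
        return 0.0
    if distinct_r <= 0 or distinct_s <= 0:
        raise ValueError("non-empty tables need positive distinct counts")
    return rows_r * rows_s / max(distinct_r, distinct_s)


# Hypothetical tables: 10,000 orders joined with 1,000 customers.
accurate = estimate_join_cardinality(10_000, 1_000, 1_000, 1_000)

# If the distinct count is underestimated as 500 on both sides,
# the estimated join size doubles.
inflated = estimate_join_cardinality(10_000, 1_000, 500, 500)
print(accurate, inflated)
```

In this example, halving the distinct-value estimate doubles the predicted join size, and an error of that kind carries over into the plan cost.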

Change the Query Plan

MySQL 8 Query Performance Tuning ◽

10.1007/978-1-4842-5584-1_24 ◽

2020 ◽

pp. 793-851

Author(s):

Jesper Wisborg Krogh

Keyword(s):

Query Plan

Analysis of Two Phase Query Optimization Algorithm for Generating Optimal Query Plan using Randomized Algorithm

SSRN Electronic Journal ◽

10.2139/ssrn.3579179 ◽

2020 ◽

Author(s):

Pramod Kumar Yadav ◽

Sam Rizvi

Keyword(s):

Query Optimization ◽

Optimization Algorithm ◽

Randomized Algorithm ◽

Two Phase ◽

Query Plan

Cost-Based Query Optimization in Centralized Relational Databases

Journal of Institute of Science and Technology ◽

10.3126/jist.v24i1.24627 ◽

2019 ◽

Vol 24 (1) ◽

pp. 42-46

Author(s):

Nawaraj Paudel ◽

Jagdish Bhatta

Keyword(s):

Query Optimization ◽

Relational Databases ◽

Query Language ◽

Database Management System ◽

Query Plan ◽

Efficiency Cost ◽

The Cost ◽

Disk Access ◽

The Given ◽

Relational Database Management

Query optimization is the most significant factor in reducing the total execution time of a query in any centralized relational database management system (RDBMS). Query optimization is the process of determining the most efficient way to execute a given SQL (Structured Query Language) query in a relational database by considering the possible query plans. The goal of query optimization is to optimize the given query for the sake of efficiency. Cost-based query optimization compares different strategies based on their relative costs (the amount of time the query needs to run) and selects and executes the one that minimizes the cost. The cost of a strategy is only an estimate, based on the CPU and I/O resources the query is expected to use. In this paper, cost is measured by counting the number of disk accesses for each query plan, because disk access tends to be the dominant cost in query processing for centralized relational databases.
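A minimal sketch of choosing a plan by counting disk accesses is given below. It assumes the textbook worst-case cost of a block nested-loop join, outer_blocks * inner_blocks + outer_blocks block transfers, which may differ from the cost formulas used in the paper. The table names and block counts are hypothetical.

```python
def block_nested_loop_cost(outer_blocks, inner_blocks):
    """Worst-case block transfers for a block nested-loop join.

    Each outer block is read once, and the inner relation is scanned
    once per outer block.
    """
    return outer_blocks * inner_blocks + outer_blocks


def cheapest_plan(plan_costs):
    """Return the name of the plan with the fewest estimated disk accesses."""
    return min(plan_costs, key=plan_costs.get)


# Hypothetical relation sizes in disk blocks (illustration only).
customers_blocks = 100
orders_blocks = 400

plan_costs = {
    "customers outer, orders inner": block_nested_loop_cost(customers_blocks, orders_blocks),
    "orders outer, customers inner": block_nested_loop_cost(orders_blocks, customers_blocks),
}

best = cheapest_plan(plan_costs)
print(best, plan_costs[best])
```

With these numbers, using the smaller relation as the outer input costs fewer block transfers, so the optimizer would select that plan.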

Automated Configuration Parameter Classification Model for Hive Query Plan on the Apache Yarn

2019 IEEE International Conference on Big Data, Cloud Computing, Data Science & Engineering (BCD) ◽

10.1109/bcd.2019.8885220 ◽

2019 ◽

Author(s):

Jongyeop Kim ◽

Seongsoo Kim ◽

Donghoon Kim ◽

Hong Liu

Keyword(s):

Query Plan ◽

Configuration Parameter

query plan
Recently Published Documents

Execution Time Prediction for Cypher Queries in the Neo4j Database Using a Learning Approach

PerfGuard

Modified Firefly Algorithm for Optimizing Biomedical Breast Cancer Queries

An automatic clustering technique for query plan recommendation

An empirical evaluation of cost-based federated SPARQL query processing engines

ESTIMATION OF ATTRIBUTE VALUES IN JOIN TABLES WHILE OPTIMIZING RELATIONAL DATABASE QUERY

Change the Query Plan

Analysis of Two Phase Query Optimization Algorithm for Generating Optimal Query Plan using Randomized Algorithm

Cost-Based Query Optimization in Centralized Relational Databases

Automated Configuration Parameter Classification Model for Hive Query Plan on the Apache Yarn
