query plan
Recently Published Documents


Total documents: 49 (five years: 12)
H-index: 4 (five years: 1)

Symmetry ◽ 2022 ◽ Vol 14 (1) ◽ pp. 55
Author(s): Zhenzhen He ◽ Jiong Yu ◽ Binglei Guo

As database management systems grow more complex, predicting the execution time of graph queries before they run has become a challenge for query scheduling, workload management, resource allocation, and progress monitoring. Existing query performance prediction methods address this problem for traditional SQL queries, but they cannot be applied directly to Cypher queries on the Neo4j database. Additionally, most query performance prediction methods focus on measuring the relationship between correlation coefficients and retrieval performance. Inspired by machine-learning methods and graph query optimization technologies, we used an RBF neural network as the prediction model to train on and predict the execution time of Cypher queries. The corresponding query pattern features, graph data features, and query plan features were fused together and used to train our prediction models. We also deployed a monitor node and designed a Cypher query benchmark for the database clusters to obtain the query plan information and native data store. The experimental results of four benchmarks showed that the average mean relative error of the RBF model reached 16.5% on the Northwind dataset, 12% on the FIFA2021 dataset, and 16.25% on the CORD-19 dataset. These results demonstrate the effectiveness of our proposed approach on three real-world datasets.
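
The abstract reports accuracy as a mean relative error. A minimal sketch of that metric, assuming the usual definition (absolute prediction error divided by the actual value, averaged over queries); the function name and the sample timings are hypothetical and not taken from the paper:

```python
def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / actual over all queries."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) / a for a, p in zip(actual, predicted))
    return total / len(actual)


# Hypothetical execution times in milliseconds (illustration only).
actual_ms = [100.0, 200.0, 400.0]
predicted_ms = [110.0, 180.0, 400.0]

mre = mean_relative_error(actual_ms, predicted_ms)
print(f"mean relative error: {mre:.2%}")
```

A value such as the reported 12% would mean that, on average, predicted execution times deviate from measured ones by about 12% of the measured time.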


2021 ◽ Vol 14 (13) ◽ pp. 3362-3375
Author(s): Remmelt Ammerlaan ◽ Gilbert Antonius ◽ Marc Friedman ◽ H M Sajjad Hossain ◽ Alekh Jindal ◽ ...

Modern data processing systems require optimization at massive scale, and using machine learning to optimize these systems (ML-for-systems) has shown promising results. Unfortunately, ML-for-systems is subject to overgeneralizations that do not capture the large variety of workload patterns, and it tends to improve performance for certain subsets of the workload while regressing performance for others. In this paper, we introduce a performance safeguard system, called PerfGuard, that designs pre-production experiments for deploying ML-for-systems. Instead of searching the entire space of query plans (a well-known, intractable problem), we focus on query plan deltas (a significantly smaller space). PerfGuard formalizes these differences and correlates plan deltas to important feedback signals, like execution cost. We describe the deep learning architecture and the end-to-end pipeline in PerfGuard that could be used with general relational databases. We show that this architecture improves on baseline models, and that our pipeline identifies key query plan components as major contributors to plan disparity. Offline experimentation shows PerfGuard as a promising approach, with many opportunities for future improvement.


2021
Author(s): Gomathi Ramalingam

Querying and retrieving Semantic Web data is a challenging task because of the growth in its volume. Many query languages have been designed to retrieve Semantic Web data, and SPARQL is a popular one. These query languages were designed with some optimization strategies, but the literature reports that they cannot handle large volumes of data efficiently. In this research, a Modified Firefly Algorithm (MFA) is applied to optimize SPARQL queries so that they can retrieve data from a large Semantic Web repository efficiently by reducing query execution time. Every query has multiple query plans generated with different cost values. The challenge is to choose the best query plan, the one that reduces the query cost and query execution time. The proposed algorithm uses the best query plan from the previous iteration to calculate the distance between two query plans using the radius parameter. The proposed algorithm generates a query plan that is a globally optimal solution. MFA is evaluated using the BioPortal dataset with triples related to breast cancer. Experimental analysis is conducted to identify the improvement in performance of the proposed work over existing nature-inspired query optimization algorithms. The efficiency of MFA is compared with that of other algorithms in terms of query execution time.


2021 ◽ Vol 545 ◽ pp. 620-632
Author(s): Elham Azhir ◽ Nima Jafari Navimipour ◽ Mehdi Hosseinzadeh ◽ Arash Sharifi ◽ Aso Darwesh

Semantic Web ◽ 2021 ◽ pp. 1-26
Author(s): Umair Qudus ◽ Muhammad Saleem ◽ Axel-Cyrille Ngonga Ngomo ◽ Young-Koo Lee

Finding a good query plan is key to the optimization of query runtime. This holds in particular for cost-based federation engines, which make use of cardinality estimations to achieve this goal. A number of studies compare SPARQL federation engines across different performance metrics, including query runtime, result set completeness and correctness, number of sources selected and number of requests sent. Albeit informative, these metrics are generic and unable to quantify and evaluate the accuracy of the cardinality estimators of cost-based federation engines. To thoroughly evaluate cost-based federation engines, the effect of estimated cardinality errors on the overall query runtime performance must be measured. In this paper, we address this challenge by presenting novel evaluation metrics targeted at a fine-grained benchmarking of cost-based federated SPARQL query engines. We evaluate five cost-based federated SPARQL query engines using existing as well as novel evaluation metrics by using LargeRDFBench queries. Our results provide a detailed analysis of the experimental outcomes that reveal novel insights, useful for the development of future cost-based federated SPARQL query processing engines.
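
The abstract does not spell out its new metrics, so the sketch below swaps in the q-error, a widely used measure of cardinality estimation accuracy, purely to illustrate what "quantifying estimator accuracy" can look like. It is not the metric proposed in the paper, and the function name and sample values are hypothetical:

```python
def q_error(estimated, actual):
    """Factor by which an estimate is off, symmetric in over/underestimation.

    Values are clamped to at least 1 so that empty results do not
    cause a division by zero. A perfect estimate gives 1.0.
    """
    est = max(float(estimated), 1.0)
    act = max(float(actual), 1.0)
    return max(est / act, act / est)


# Hypothetical (estimated, actual) cardinalities for three sub-queries.
samples = [(50, 200), (200, 50), (100, 100)]
errors = [q_error(est, act) for est, act in samples]
print(errors)
```

Underestimating by a factor of four and overestimating by a factor of four yield the same q-error, which makes the measure convenient for comparing estimators across engines.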


Author(s): Y.A. Grigorev ◽ O.Yu. Pluzhnikova

The article analyzes the problem of estimating the cardinality of table joins when calculating the cost of a relational database query plan. A new algorithm for estimating the number of distinct attribute values is proposed. The algorithm reduces inaccuracy in cardinality estimation. The consistency of the proposed algorithm is proved.
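
To show why distinct-value counts matter here, the sketch below uses the textbook equi-join estimate |R| * |S| / max(V(R, a), V(S, a)), which assumes uniformly distributed values and containment of value sets. It is not the article's new algorithm; the function name and table sizes are hypothetical:

```python
def estimate_join_cardinality(rows_r, rows_s, distinct_r, distinct_s):
    """Textbook estimate for an equi-join of R and S on attribute a.

    distinct_r and distinct_s are the numbers of distinct values of a
    in R and S. Assumes uniformity and containment of value sets.
    """
    if min(rows_r, rows_s, distinct_r, distinct_s) <= 0:
        raise ValueError("all inputs must be positive")
    return rows_r * rows_s / max(distinct_r, distinct_s)


# Hypothetical tables: 1,000 and 5,000 rows.
accurate = estimate_join_cardinality(1000, 5000, 100, 50)
# Same join, but the distinct count of R is overestimated as 200.
skewed = estimate_join_cardinality(1000, 5000, 200, 50)
print(accurate, skewed)
```

Doubling the distinct-value estimate halves the predicted join size, so errors in distinct-value estimation carry straight through to the plan cost.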


2020 ◽ pp. 793-851
Author(s): Jesper Wisborg Krogh

2019 ◽ Vol 24 (1) ◽ pp. 42-46
Author(s): Nawaraj Paudel ◽ Jagdish Bhatta

Query optimization is the most significant factor in reducing the total execution time of a query in any centralized relational database management system (RDBMS). Query optimization is the process of determining the most efficient way to execute a given SQL (Structured Query Language) query in a relational database by considering the possible query plans. Cost-based query optimization compares different strategies based on their relative costs (the amount of time the query needs to run) and selects and executes the one that minimizes the cost. The cost of a strategy is only an estimate, based on the CPU and I/O resources the query is expected to use. In this paper, cost is measured by counting the number of disk accesses for each query plan, because disk access tends to be the dominant cost in query processing for centralized relational databases.
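
As an illustration of counting disk accesses per plan, the sketch below compares two orderings of a block nested-loop join using the textbook worst-case formula (outer blocks + outer blocks * inner blocks, with one buffer block per relation). The abstract does not say which operators or formulas the paper uses, so the formula choice, table names, and block counts are assumptions:

```python
def block_nested_loop_cost(outer_blocks, inner_blocks):
    """Worst-case block transfers for a block nested-loop join.

    Each outer block is read once, and the whole inner relation is
    scanned once per outer block.
    """
    return outer_blocks + outer_blocks * inner_blocks


# Hypothetical relation sizes in disk blocks.
ORDERS_BLOCKS = 400
CUSTOMERS_BLOCKS = 100

plans = {
    "orders outer, customers inner": block_nested_loop_cost(
        ORDERS_BLOCKS, CUSTOMERS_BLOCKS
    ),
    "customers outer, orders inner": block_nested_loop_cost(
        CUSTOMERS_BLOCKS, ORDERS_BLOCKS
    ),
}

# A cost-based optimizer picks the plan with the fewest disk accesses.
best_plan = min(plans, key=plans.get)
print(plans, best_plan)
```

Under this formula, placing the smaller relation on the outer side gives the cheaper plan, which is the kind of comparison a cost-based optimizer performs.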

