query performance Latest Research Papers

With database management systems becoming complex, predicting the execution time of graph queries before they are executed is one of the challenges for query scheduling, workload management, resource allocation, and progress monitoring. Through the comparison of query performance prediction methods, existing research works have solved such problems in traditional SQL queries, but they cannot be directly applied in Cypher queries on the Neo4j database. Additionally, most query performance prediction methods focus on measuring the relationship between correlation coefficients and retrieval performance. Inspired by machine-learning methods and graph query optimization technologies, we used the RBF neural network as a prediction model to train and predict the execution time of Cypher queries. Meanwhile, the corresponding query pattern features, graph data features, and query plan features were fused together and then used to train our prediction models. Furthermore, we also deployed a monitor node and designed a Cypher query benchmark for the database clusters to obtain the query plan information and native data store. The experimental results of four benchmarks showed that the average mean relative error of the RBF model reached 16.5% in the Northwind dataset, 12% in the FIFA2021 dataset, and 16.25% in the CORD-19 dataset. This experiment proves the effectiveness of our proposed approach on three real-world datasets.

Download Full-text

A semantic approach to post-retrieval query performance prediction

Information Processing & Management ◽

10.1016/j.ipm.2021.102746 ◽

2022 ◽

Vol 59 (1) ◽

pp. 102746

Author(s):

Parastoo Jafarzadeh ◽

Faezeh Ensan

Keyword(s):

Performance Prediction ◽

Semantic Approach ◽

Query Performance

Download Full-text

A Unification of Heterogeneous Data Sources into a Graph Model in E-commerce

Data Science and Engineering ◽

10.1007/s41019-021-00174-0 ◽

2021 ◽

Author(s):

Sonal Tuteja ◽

Rajeev Kumar

Keyword(s):

Large Scale ◽

Graph Model ◽

Heterogeneous Data ◽

Data Models ◽

Graph Models ◽

Query Performance ◽

Source Models ◽

Database Size ◽

Modelling Techniques ◽

Heterogeneous Source

AbstractThe incorporation of heterogeneous data models into large-scale e-commerce applications incurs various complexities and overheads, such as redundancy of data, maintenance of different data models, and communication among different models for query processing. Graphs have emerged as data modelling techniques for large-scale applications with heterogeneous, schemaless, and relationship-centric data. Models exist for mapping different types of data to a graph; however, the unification of data from heterogeneous source models into a graph model has not received much attention. To address this, we propose a new framework in this study. The proposed framework first transforms data from various source models into graph models individually and then unifies them into a single graph. To justify the applicability of the proposed framework in e-commerce applications, we analyse and compare query performance, scalability, and database size of the unified graph with heterogeneous source data models for a predefined set of queries. We also access some qualitative measures, such as flexibility, completeness, consistency, and maturity for the proposed unified graph. Based on the experimental results, the unified graph outperforms heterogeneous source models for query performance and scalability; however, it falls behind for database size.

Download Full-text

Scalable Learning to Troubleshoot Query Performance Problems

10.1145/3459637.3481947 ◽

2021 ◽

Author(s):

Alexandar Mihaylov ◽

Vincent Corvinelli ◽

Parke Godfrey ◽

Piotr Mierzejewski ◽

Jaroslaw Szlichta ◽

...

Keyword(s):

Query Performance ◽

Scalable Learning ◽

Performance Problems

Download Full-text

BERT-QPP: Contextualized Pre-trained transformers for Query Performance Prediction

10.1145/3459637.3482063 ◽

2021 ◽

Author(s):

Negar Arabzadeh ◽

Maryam Khodabakhsh ◽

Ebrahim Bagheri

Keyword(s):

Performance Prediction ◽

Query Performance

Download Full-text

A Neural Networks Approach to SPARQL Query Performance Prediction

10.1109/clei53233.2021.9639899 ◽

2021 ◽

Author(s):

Daniel Arturo Casal Amat ◽

Carlos Buil-Aranda ◽

Carlos Valle-Vidal

Keyword(s):

Neural Networks ◽

Performance Prediction ◽

Sparql Query ◽

Query Performance

Download Full-text

A new framework based on features modeling and ensemble learning to predict query performance

PLoS ONE ◽

10.1371/journal.pone.0258439 ◽

2021 ◽

Vol 16 (10) ◽

pp. e0258439

Author(s):

Mohamed Zaghloul ◽

Mofreh Salem ◽

Amr Ali-Eldin

Keyword(s):

Prediction Model ◽

Ensemble Learning ◽

Experimental Work ◽

Performance Prediction ◽

Missing Values ◽

Empirical Work ◽

Feature Modeling ◽

Training Dataset ◽

Query Performance ◽

New Framework

A query optimizer attempts to predict a performance metric based on the amount of time elapsed. Theoretically, this would necessitate the creation of a significant overhead on the core engine to provide the necessary query optimizing statistics. Machine learning is increasingly being used to improve query performance by incorporating regression models. To predict the response time for a query, most query performance approaches rely on DBMS optimizing statistics and the cost estimation of each operator in the query execution plan, which also focuses on resource utilization (CPU, I/O). Modeling query features is thus a critical step in developing a robust query performance prediction model. In this paper, we propose a new framework based on query feature modeling and ensemble learning to predict query performance and use this framework as a query performance predictor simulator to optimize the query features that influence query performance. In query feature modeling, we propose five dimensions used to model query features. The query features dimensions are syntax, hardware, software, data architecture, and historical performance logs. These features will be based on developing training datasets for the performance prediction model that employs the ensemble learning model. As a result, ensemble learning leverages the query performance prediction problem to deal with missing values. Handling overfitting via regularization. The section on experimental work will go over how to use the proposed framework in experimental work. The training dataset in this paper is made up of performance data logs from various real-world environments. The outcomes were compared to show the difference between the actual and expected performance of the proposed prediction model. Empirical work shows the effectiveness of the proposed approach compared to related work.

Download Full-text

Comparing Alternative Storage Models for Words Extracted from Legal Texts

10.5753/sbbd_estendido.2021.18160 ◽

2021 ◽

Author(s):

Ana Paula Sodré ◽

Luis Eduardo Mochenski Floriano ◽

Dimmy Magalhães ◽

Cristina D. Aguiar ◽

Aurora Pozo ◽

...

Keyword(s):

Data Warehouse ◽

Performance Improvement ◽

Federal Court ◽

Quantitative Information ◽

Judicial System ◽

Query Performance ◽

Legal Texts ◽

Legal Expert

The COVID-19 pandemic created new demands for services in the judicial system, requiring the use of a data warehouse (DW). Although there exist approaches that use DW in the judicial domain, few target the pandemic or publicly provide the information extracted from the texts. Following the needs of a legal expert, we have developed the COVID-19 Portal. It extracts documents from the Supreme Federal Court in Brazil to obtain quantitative information on words used in the texts. In this paper, we present the design of a DW, and show the query performance improvement achieved with its implementation. The DW has been developed on Postgres, and its performance is compared with the original implementation on MongoDB Cloud and a local MongoDB database.

Download Full-text

A Comparative Analysis of Knowledge Graph Query Performance

10.1109/transai51903.2021.00014 ◽

2021 ◽

Author(s):

Masoud Salehpour ◽

Joseph G. Davis

Keyword(s):

Comparative Analysis ◽

Knowledge Graph ◽

Query Performance ◽

Graph Query ◽

Analysis Of Knowledge

Download Full-text

PoBery: Possibly-complete Big Data Queries with Probabilistic Data Placement and Scanning

ACM/IMS Transactions on Data Science ◽

10.1145/3465375 ◽

2021 ◽

Vol 2 (3) ◽

pp. 1-28

Author(s):

Jie Song ◽

Qiang He ◽

Feifei Chen ◽

Ye Yuan ◽

Ge Yu

Keyword(s):

Big Data ◽

Query Processing ◽

State Of The Art ◽

Data Placement ◽

Probabilistic Data ◽

Trade Off ◽

Query Performance ◽

Data Query ◽

Query Efficiency ◽

The Given

In big data query processing, there is a trade-off between query accuracy and query efficiency, for example, sampling query approaches trade-off query completeness for efficiency. In this article, we argue that query performance can be significantly improved by slightly losing the possibility of query completeness, that is, the chance that a query is complete. To quantify the possibility, we define a new concept, Probability of query Completeness (hereinafter referred to as PC). For example, If a query is executed 100 times, PC = 0.95 guarantees that there are no more than 5 incomplete results among 100 results. Leveraging the probabilistic data placement and scanning, we trade off PC for query performance. In the article, we propose PoBery (POssibly-complete Big data quERY), a method that supports neither complete queries nor incomplete queries, but possibly-complete queries. The experimental results conducted on HiBench prove that PoBery can significantly accelerate queries while ensuring the PC. Specifically, it is guaranteed that the percentage of complete queries is larger than the given PC confidence. Through comparison with state-of-the-art key-value stores, we show that while Drill-based PoBery performs as fast as Drill on complete queries, it is 1.7 ×, 1.1 ×, and 1.5 × faster on average than Drill, Impala, and Hive, respectively, on possibly-complete queries.

Download Full-text

query performance
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Execution Time Prediction for Cypher Queries in the Neo4j Database Using a Learning Approach

A semantic approach to post-retrieval query performance prediction

A Unification of Heterogeneous Data Sources into a Graph Model in E-commerce

Scalable Learning to Troubleshoot Query Performance Problems

BERT-QPP: Contextualized Pre-trained transformers for Query Performance Prediction

A Neural Networks Approach to SPARQL Query Performance Prediction

A new framework based on features modeling and ensemble learning to predict query performance

Comparing Alternative Storage Models for Words Extracted from Legal Texts

A Comparative Analysis of Knowledge Graph Query Performance

PoBery: Possibly-complete Big Data Queries with Probabilistic Data Placement and Scanning

Export Citation Format

query performanceRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Execution Time Prediction for Cypher Queries in the Neo4j Database Using a Learning Approach

A semantic approach to post-retrieval query performance prediction

A Unification of Heterogeneous Data Sources into a Graph Model in E-commerce

Scalable Learning to Troubleshoot Query Performance Problems

BERT-QPP: Contextualized Pre-trained transformers for Query Performance Prediction

A Neural Networks Approach to SPARQL Query Performance Prediction

A new framework based on features modeling and ensemble learning to predict query performance

Comparing Alternative Storage Models for Words Extracted from Legal Texts

A Comparative Analysis of Knowledge Graph Query Performance

PoBery: Possibly-complete Big Data Queries with Probabilistic Data Placement and Scanning

query performance
Recently Published Documents