The Combination and Evaluation of Query Performance Prediction Methods

As database management systems grow more complex, predicting the execution time of graph queries before they run has become a key challenge for query scheduling, workload management, resource allocation, and progress monitoring. Existing work on query performance prediction addresses this problem for traditional SQL queries, but those methods cannot be applied directly to Cypher queries on the Neo4j database. In addition, most query performance prediction methods focus on measuring the relationship between correlation coefficients and retrieval performance. Inspired by machine-learning methods and graph query optimization techniques, we used an RBF neural network as the prediction model for the execution time of Cypher queries. Query pattern features, graph data features, and query plan features were fused and used to train the prediction models. We also deployed a monitor node and designed a Cypher query benchmark for the database clusters to obtain the query plan information and native data store. Experimental results on four benchmarks showed that the average mean relative error of the RBF model reached 16.5% on the Northwind dataset, 12% on the FIFA2021 dataset, and 16.25% on the CORD-19 dataset. These results demonstrate the effectiveness of the proposed approach on three real-world datasets.
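
The abstract reports accuracy as a mean relative error without defining it. A minimal sketch of one common definition follows, assuming the error is the absolute difference between predicted and actual execution time divided by the actual time, averaged over queries. The function name and the sample timings are hypothetical and are not taken from the paper.

```python
def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / |actual| over all queries.

    Assumed definition; the paper may normalise differently.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted))
    return total / len(actual)


# Hypothetical execution times in milliseconds (illustration only).
actual_ms = [100.0, 200.0, 400.0]
predicted_ms = [110.0, 180.0, 400.0]

mre = mean_relative_error(actual_ms, predicted_ms)
print(f"mean relative error: {mre:.2%}")
```

Under this definition, a value of 16.5% would mean that predictions deviate from the measured execution time by about one sixth on average.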

Evaluation of Query Performance Prediction Methods by Range

String Processing and Information Retrieval - Lecture Notes in Computer Science ◽ 10.1007/978-3-642-16321-0_23 ◽ 2010 ◽ pp. 225-236

Author(s): Joaquín Pérez-Iglesias ◽ Lourdes Araujo

Keyword(s): Performance Prediction ◽ Prediction Methods ◽ Query Performance

When is query performance prediction effective?

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '09 ◽ 10.1145/1571941.1572150 ◽ 2009 ◽ Cited By ~ 4

Author(s): Claudia Hauff ◽ Leif Azzopardi

Keyword(s): Performance Prediction ◽ Query Performance

Forward and backward feature selection for query performance prediction

Proceedings of the 35th Annual ACM Symposium on Applied Computing ◽ 10.1145/3341105.3373904 ◽ 2020 ◽ Cited By ~ 1

Author(s): Sébastien Déjean ◽ Radu Tudor Ionescu ◽ Josiane Mothe ◽ Md Zia Ullah

Keyword(s): Feature Selection ◽ Performance Prediction ◽ Query Performance ◽ Selection For

Multi-metric Graph Query Performance Prediction

Database Systems for Advanced Applications - Lecture Notes in Computer Science ◽ 10.1007/978-3-319-91452-7_19 ◽ 2018 ◽ pp. 289-306

Author(s): Keyvan Sasani ◽ Mohammad Hossein Namaki ◽ Yinghui Wu ◽ Assefaw H. Gebremedhin

Keyword(s): Performance Prediction ◽ Query Performance ◽ Metric Graph ◽ Graph Query

Query performance prediction for microblog search

Information Processing & Management ◽ 10.1016/j.ipm.2017.08.002 ◽ 2017 ◽ Vol 53 (6) ◽ pp. 1320-1341 ◽ Cited By ~ 5

Author(s): Maram Hasanain ◽ Tamer Elsayed

Keyword(s): Performance Prediction ◽ Query Performance ◽ Microblog Search

Neural Embedding-Based Metrics for Pre-retrieval Query Performance Prediction

10.32920/ryerson.14654253.v1 ◽ 2021

Author(s): Negar Arabzadehghahyazi

Keyword(s): Performance Prediction ◽ State Of The Art ◽ Learning To Rank ◽ The State ◽ Test Collection ◽ Query Performance ◽ Performance Predictors ◽ Level Statistics ◽ Ablation Study ◽ Individual Specificity

Pre-retrieval Query Performance Prediction (QPP) methods are oblivious to the performance of the retrieval model, as they predict query difficulty before observing the set of documents retrieved for the query. Among pre-retrieval query performance predictors, specificity-based metrics investigate how corpus, query, and corpus-query level statistics can be used to predict the performance of the query. In this thesis, we explore how neural embeddings can be used to define corpus-independent and semantics-aware specificity metrics. Our metrics are based on the intuition that a term closely surrounded by other terms in the embedding space is more likely to be specific, while a term surrounded by less closely related terms is more likely to be generic. On this basis, we leverage geometric properties between embedded terms to define four groups of metrics: (1) neighborhood-based, (2) graph-based, (3) cluster-based, and (4) vector-based metrics. Moreover, we employ learning-to-rank techniques to analyze the importance of individual specificity metrics. To evaluate the proposed metrics, we have curated and publicly shared a test collection of term specificity measurements based on the Wikipedia category hierarchy and the DMOZ taxonomy. We report on extensive experiments on the effectiveness of our metrics through metric comparison, an ablation study, and comparison against state-of-the-art baselines. We show that our proposed set of pre-retrieval QPP metrics, based on the properties of pre-trained neural embeddings, is more effective for performance prediction than state-of-the-art methods. We report our findings on the Robust04, ClueWeb09, and Gov2 corpora and their associated TREC topics.
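
The neighborhood intuition in this abstract can be illustrated with a small sketch. It assumes a neighborhood-based score defined as the average cosine similarity between a term and its k nearest neighbours in the embedding space; this is one plausible reading, not necessarily a metric defined in the thesis. The function names and the two-dimensional toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def neighborhood_specificity(term, embeddings, k=2):
    """Mean cosine similarity between `term` and its k nearest neighbours.

    Assumed illustrative metric: a higher score means a tighter
    neighbourhood, which the abstract associates with specific terms.
    """
    target = embeddings[term]
    sims = sorted(
        (cosine(target, vec) for other, vec in embeddings.items() if other != term),
        reverse=True,
    )
    top = sims[:k]
    if not top:
        raise ValueError("need at least one other term to compare against")
    return sum(top) / len(top)


# Hypothetical 2-D embeddings (real pre-trained embeddings have hundreds of dimensions).
toy = {
    "espresso": [0.90, 0.10],
    "latte": [0.88, 0.15],
    "cappuccino": [0.85, 0.20],
    "thing": [0.10, 0.90],
    "stuff": [-0.60, 0.50],
}

for word in ("espresso", "thing"):
    print(word, round(neighborhood_specificity(word, toy), 3))
```

In this toy space "espresso" sits in a tight cluster and scores higher than "thing", whose nearest neighbours are only loosely related to it.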
