Query Optimizers
Recently Published Documents


TOTAL DOCUMENTS: 39 (five years: 10)

H-INDEX: 8 (five years: 1)

2021 ◽  
Vol 37 (3) ◽  
pp. 223-238
Author(s):  
Hung Q. Ngo

I would like to dedicate this little exposition to Prof. Phan Dinh Dieu, one of the giants and pioneers of Mathematics in Computer Science in Vietnam. In the past 15 years or so, new and exciting connections have emerged between fundamental problems in database theory and information theory. There are several angles one can take to describe this connection. This paper takes one such angle, influenced by the author's own bias and research results. In particular, we describe how the cardinality estimation problem -- a cornerstone problem for query optimizers -- is deeply connected to information-theoretic inequalities. Furthermore, we explain how these inequalities can also be used to derive classic geometric results such as the Loomis-Whitney inequality. One purpose of the article is to introduce the reader to these new connections, where theory and practice meet in a wonderful way. Another is to point the reader to a research area with many new open questions.
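
To make the connection concrete, here is the best-known instance (a standard derivation, not reproduced from the paper): an entropy argument bounds the output size of the triangle join and, as a corollary, yields the Loomis-Whitney inequality. Let $Q(x,y,z) = R(x,y) \bowtie S(y,z) \bowtie T(x,z)$ and let $(X,Y,Z)$ be uniformly distributed over the tuples of $Q$, so that $H(X,Y,Z) = \log |Q|$. Shearer's lemma gives
\[
H(X,Y,Z) \le \tfrac{1}{2}\bigl(H(X,Y) + H(Y,Z) + H(X,Z)\bigr)
          \le \tfrac{1}{2}\bigl(\log|R| + \log|S| + \log|T|\bigr),
\]
hence $|Q| \le \sqrt{|R|\,|S|\,|T|}$, a worst-case bound no histogram can certify. Taking $R$, $S$, $T$ to be the three coordinate projections of a finite point set $P \subseteq \mathbb{Z}^3$ recovers the Loomis-Whitney inequality $|P|^2 \le |\pi_{xy}(P)|\,|\pi_{yz}(P)|\,|\pi_{xz}(P)|$.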


2021 ◽  
Author(s):  
Richard T. Snodgrass ◽  
Sabah Currim ◽  
Young-Kyoon Suh
Keyword(s):  

2021 ◽  
Vol 15 (1) ◽  
pp. 85-97
Author(s):  
Ji Sun ◽  
Jintao Zhang ◽  
Zhaoyan Sun ◽  
Guoliang Li ◽  
Nan Tang

Cardinality estimation is core to the query optimizers of DBMSs. Non-learned methods, especially those based on histograms and sampling, have been widely used in commercial and open-source DBMSs. Nevertheless, histograms and samples summarize only one or a few columns, so they fall short of capturing the joint data distribution over an arbitrary combination of columns, because they oversimplify the original relational table(s). Consequently, these traditional methods typically make poor predictions in hard cases such as queries over multiple columns, queries with multiple predicates, and joins between multiple tables. Recently, learned cardinality estimators have been widely studied. Because these learned estimators better capture data distributions and query characteristics, empowered by recent advances in (deep) learning models, they outperform non-learned methods in many cases. The goals of this paper are to explore the design space of learned cardinality estimators and to compare the state-of-the-art learned approaches comprehensively, so as to guide practitioners in deciding which method to use in various practical scenarios.
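
A minimal sketch (not from the paper) of the failure mode described above: when two columns are perfectly correlated, a histogram-style estimate that multiplies per-column selectivities under an independence assumption underestimates the true cardinality by an order of magnitude.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    a = rng.integers(0, 100, size=n)  # column A, uniform over 100 values
    b = a.copy()                      # column B, perfectly correlated with A

    # Per-column selectivities, as a one-dimensional histogram would report them.
    sel_a = np.mean(a < 10)
    sel_b = np.mean(b < 10)

    est_independent = sel_a * sel_b * n      # independence assumption
    true_card = np.sum((a < 10) & (b < 10))  # actual result size

    print(f"estimate: {est_independent:.0f}  true: {true_card}")
    # estimate: ~1000  true: ~10000 -- a 10x underestimate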


2021 ◽  
Author(s):  
Srikanth Kandula ◽  
Laurel Orr ◽  
Surajit Chaudhuri
Keyword(s):  

Author(s):  
Parimarjan Negi ◽  
Matteo Interlandi ◽  
Ryan Marcus ◽  
Mohammad Alizadeh ◽  
Tim Kraska ◽  
...  
Keyword(s):  
Big Data ◽  

2021 ◽  
Vol 34 - 2020 - Special... ◽ 
Author(s):  
Simon Pierre Dembele ◽  
Ladjel Bellatreche ◽  
Carlos Ordonez ◽  
Nabil Gmati ◽  
Mathieu Roche ◽  
...  

Computers and electronic machines in businesses consume a significant amount of electricity, releasing carbon dioxide (CO2) and thereby contributing to greenhouse gas emissions. Energy efficiency is a pressing concern in IT systems, from mobile devices to large servers in data centers, as they strive to be more environmentally responsible. To meet the growing demand for awareness of excessive energy consumption, many energy-efficiency initiatives for big data processing have been launched, covering electronic components, software, and applications. Query optimizers are among the most power-consuming components of a DBMS. They can be modified to take the energy cost of query plans into account through energy-based cost models, with the aim of reducing the power consumption of computer systems. In this paper, we study, describe, and evaluate the design of three energy cost models whose energy-sensitive parameter values are determined using nonlinear regression and random forest techniques. To this end, we study the operating principles of the selected DBMSs in depth and present an analysis comparing the execution time and energy consumption of typical TPC-H benchmark queries. We perform extensive experiments on a physical testbed running PostgreSQL, MonetDB, and Hyrise, using workloads generated from the TPC-H benchmark, to validate our proposal.
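
As an illustration of the parameter-fitting step (a sketch under assumed, synthetic features; the paper's actual models and measurements are not reproduced here), one can fit a random forest that predicts a plan's energy consumption from plan-level features:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    # Hypothetical plan features: estimated rows, number of joins,
    # pages scanned, sort memory. In practice the energy labels (joules)
    # would come from a power meter attached to the testbed.
    rng = np.random.default_rng(42)
    X = rng.random((500, 4))
    y = 5.0 + 20.0 * X[:, 0] + 8.0 * X[:, 1] ** 2 + rng.normal(0.0, 0.5, 500)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_tr, y_tr)
    print("held-out R^2:", round(model.score(X_te, y_te), 3))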


Classical query optimizers rely on sophisticated cost models to estimate the cost of executing a query and its operators. Using this cost model, the optimizer constructs an efficient global plan for executing a given query. Such cost modeling is difficult to implement in Web query engines because many local data sources are unwilling to share metadata for confidentiality reasons. In this work, efficient and effective cost modeling techniques for Web query engines are proposed. These techniques do not force local data sources to reveal their metadata; instead, they employ a learning mechanism to estimate the cost of executing a given local query. Two cost modeling algorithms are presented: a Poisson cost model and an Exponential cost model. Empirical results on real-world datasets demonstrate the efficiency and effectiveness of the new cost models.
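
A minimal sketch of the idea (the paper's actual algorithms are not given here; the data and names below are hypothetical): an exponential cost model can be fit from observed response times alone, with no metadata from the source, using the maximum-likelihood rate 1/mean.

    import numpy as np

    # Observed response times (seconds) for past queries against one
    # remote source; no source metadata is required.
    times = np.array([0.8, 1.1, 0.9, 1.4, 0.7, 1.2, 1.0, 0.95])

    rate = 1.0 / times.mean()              # MLE of the exponential rate
    expected_cost = 1.0 / rate             # predicted cost of the next query
    p_within_2s = 1.0 - np.exp(-rate * 2)  # P(response time <= 2 s)

    print(f"expected cost: {expected_cost:.2f}s  P(t<=2s): {p_within_2s:.2f}")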


2019 ◽  
Vol 13 (3) ◽  
pp. 348-361 ◽  
Author(s):  
Jyoti Leeka ◽  
Kaushik Rajan
Keyword(s):  
Big Data ◽  

2019 ◽  
Vol 23 (3) ◽  
pp. 2323-2345
Author(s):  
Simon Pierre Dembele ◽  
Ladjel Bellatreche ◽  
Carlos Ordonez ◽  
Amine Roukh
Keyword(s):  
