Advanced Query Optimization

Author(s):

Antonio Badia

Keyword(s):

Query Optimization ◽

Cost Optimization ◽

Relational Algebra ◽

Materialized Views ◽

Benchmark Database ◽

Join Queries ◽

Relational Systems ◽

Foreign Key ◽

Complementary Techniques ◽

Query optimization has been an active area of research ever since the first relational systems were implemented. In the last few years, research in the area has experienced renewed impulse, thanks to new development like data warehousing. In this article, we overview some of the recent advances made in complex query optimization. This article assumes knowledge of SQL and relational algebra, as well as the basics of query processing; in particular, the user is assumed to understand cost optimization of basic SQL blocks (Select-Project-Join queries). After explaining the basic unnesting approach to provide some background, we overview three complementary techniques: source and algebraic transformations (in particular, moving outerjoins and pushing down aggregates), query rewrite (materialized views), new indexing techniques (bitmap and join indices), and different methods to build the answer (online aggregation and sampling). Throughout this article, we will use subquery to refer to a nested SQL query and outer query to refer to an SQL query that contains a nested query. The TPCH benchmark database (TPC, n.d.) is used as a source of examples. This database includes (in ascending size order) tables Nation, Customer (with a foreign key for Nation), Order (with a foreign key for Customer), and Lineitem (with a foreign key for Order). All attributes from a table are prefixed with the initial of the table’s name (“c_” for Customer, and so on).

SQL query optimization through nested relational algebra

ACM Transactions on Database Systems ◽

10.1145/1272743.1272748 ◽

2007 ◽

Vol 32 (3) ◽

pp. 18 ◽

Cited By ~ 3

Author(s):

Bin Cao ◽

Antonio Badia

Keyword(s):

Query Optimization ◽

Relational Algebra ◽

MPP SQL Query Optimization with RTCG

Big Data Analytics - Lecture Notes in Computer Science ◽

10.1007/978-3-030-04780-1_16 ◽

2018 ◽

pp. 228-249

Author(s):

K. T. Sridhar ◽

M. A. Sakkeer ◽

Shiju Andrews ◽

Jimson Johnson

Keyword(s):

Query Optimization ◽

SQL Query Optimization on Cross Nodes for Distributed System

DEStech Transactions on Environment Energy and Earth Science ◽

10.12783/dteees/peem2016/5029 ◽

2017 ◽

Author(s):

Feng ZHAO ◽

Qiao SUN ◽

Yan-bin JIAO ◽

Jia-song SUN

Keyword(s):

Distributed System ◽

Query Optimization ◽

A Genetic Algorithm for Selecting Horizontal Fragments

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch142 ◽

2011 ◽

pp. 920-925

Author(s):

Ladjel Bellatreche

Keyword(s):

Database Systems ◽

Data Partitioning ◽

Materialized Views ◽

Access Path ◽

Multiple Dimensions ◽

Physical Database Design ◽

Join Queries ◽

Speed Up ◽

Disjoint Sets ◽

Horizontal Partitioning

Decision support applications require complex queries, e.g., multi way joins defining on huge warehouses usually modelled using star schemas, i.e., a fact table and a set of data dimensions (Papadomanolakis & Ailamaki, 2004). Star schemas have an important property in terms of join operations between dimensions tables and the fact table (i.e., the fact table contains foreign keys for each dimension). None join operations between dimension tables. Joins in data warehouses (called star join queries) are particularly expensive because the fact table (the largest table in the warehouse by far) participates in every join and multiple dimensions are likely to participate in each join. To speed up star join queries, many optimization structures were proposed: redundant structures (materialized views and advanced index schemes) and non redundant structures (data partitioning and parallel processing). Recently, data partitioning is known as an important aspect of physical database design (Sanjay, Narasayya & Yang, 2004; Papadomanolakis & Ailamaki, 2004). Two types of data partitioning are available (Özsu & Valduriez, 1999): vertical and horizontal partitioning. Vertical partitioning allows tables to be decomposed into disjoint sets of columns. Horizontal partitioning allows tables, materialized views and indexes to be partitioned into disjoint sets of rows that are physically stored and usually accessed separately. Contrary to redundant structures, data partitioning does not replicate data, thereby reducing storage requirement and minimizing maintenance overhead. In this paper, we concentrate only on horizontal data partitioning (HP). HP may affect positively (1) query performance, by performing partition elimination: if a query includes a partition key as a predicate in the WHERE clause, the query optimizer will automatically route the query to only relevant partitions and (2) database manageability: for instance, by allocating partitions in different machines or by splitting any access paths: tables, materialized views, indexes, etc. Most of database systems allow three methods to perform the HP using PARTITION statement: RANGE, HASH and LIST (Sanjay, Narasayya & Yang, 2004). In the range partitioning, an access path (table, view, and index) is split according to a range of values of a given set of columns. The hash mode decomposes the data according to a hash function (provided by the system) applied to the values of the partitioning columns. The list partitioning splits a table according to the listed values of a column. These methods can be combined to generate composite partitioning. Oracle currently supports range-hash and range-list composite partitioning using PARTITION - SUBPARTITION statement. The following SQL statement shows an example of fragmenting a table Student using range partitioning.

Recursive SQL Query Optimization with k-Iteration Lookahead

Lecture Notes in Computer Science - Database and Expert Systems Applications ◽

10.1007/11827405_34 ◽

2006 ◽

pp. 348-357 ◽

Cited By ~ 7

Author(s):

Ahmad Ghazal ◽

Alain Crolotte ◽

Dawit Seid

Keyword(s):

Query Optimization ◽

A Query Beehive Algorithm for Data Warehouse Buffer Management and Query Scheduling

International Journal of Data Warehousing and Mining ◽

10.4018/ijdwm.2014070103 ◽

2014 ◽

Vol 10 (3) ◽

pp. 34-58 ◽

Author(s):

Amira Kerkad ◽

Ladjel Bellatreche ◽

Pascal Richard ◽

Carlos Ordonez ◽

Dominique Geniet

Keyword(s):

Query Optimization ◽

Buffer Management ◽

Optimization Techniques ◽

Materialized Views ◽

Star Schema ◽

Joint Problem ◽

Fixed Order ◽

Pass Through ◽

Np Hardness ◽

Query Scheduling

Analytical queries, like those used in data warehouses and OLAP, are generally interdependent. This is due to the fact that the database is usually modeled with a denormalized star schema or its variants, where most queries pass through a large central fact table. Such interaction has been largely exploited in query optimization techniques such as materialized views. Nevertheless, such approaches usually ignore buffer management and assume queries have a fixed order and are known in advance. We believe such assumptions are too strong and thus they need to be revisited and simplified. In this paper, we study the combination of two problems: buffer management and query scheduling, in both static and dynamic scenarios. We present an NP-hardness study of the joint problem, highlighting its complexity. We then introduce a new and highly efficient algorithm inspired by a beehive. We conduct an extensive experimental evaluation on a real DBMS showing the superiority of our algorithm compared to previous ones as well as its excellent scalability.

SQL Query Optimization Methods of Relational Database System

2010 Second International Conference on Computer Engineering and Applications ◽

10.1109/iccea.2010.113 ◽

2010 ◽

Cited By ~ 2

Author(s):

Dandan Li ◽

Lu Han ◽

Yi Ding

Keyword(s):

Query Optimization ◽

Relational Database ◽

Optimization Methods ◽

Database System ◽

Relational Database System ◽

SQL query optimization in Content Based Image Retrieval systems

2016 International Conference on Communications (COMM) ◽

10.1109/iccomm.2016.7528344 ◽

2016 ◽

Author(s):

Nicoleta Angelescu ◽

Henri George Coanda ◽

Ion Caciula ◽

Catalin Dragoi ◽

Felix Albu

Keyword(s):

Image Retrieval ◽

Query Optimization ◽

Content Based Image Retrieval ◽

Retrieval Systems ◽

Exploiting inter-operation parallelism for SQL query optimization

Lecture Notes in Computer Science - Database and Expert Systems Applications ◽

10.1007/3-540-58435-8_242 ◽

1994 ◽

pp. 759-768 ◽

Author(s):

A. Hameurlain ◽

F. Morvan

Keyword(s):

Query Optimization ◽

Simulation for Generate Efficient Evaluation Plans with Materialized Views

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.748.976 ◽

2013 ◽

Vol 748 ◽

pp. 976-981

Author(s):

Jia Chen ◽

Hong Wei Chen ◽

Xin Rong Hu

Keyword(s):

Query Optimization ◽

Efficient Algorithm ◽

Materialized Views ◽

Good Efficiency

We study the problem of finding efficient equivalent view-based rewritings of relational queries, focusing on query optimization using materialized views under the assumption that base relations cannot contain duplicate tuples. We propose efficient algorithm SRGP for finding approximate optimal rewriting. Our results apply not only to query optimization, but to all areas where the goals is to obtain efficient equivalent view-based query rewritings. Experiments show that our algorithm of generating optimal rewriting has good efficiency and scalability.