A Cost-Based Range Estimation for Mapping Top-k Selection Queries over Relational Databases

2009 ◽  
Vol 20 (4) ◽  
pp. 1-25 ◽  
Author(s):  
Anteneh Ayanso ◽  
Paulo B. Goes ◽  
Kumar Mehta

Finding efficient methods for supporting top-k relational queries has received significant attention in academic research. One of the approaches in the recent literature is query-mapping, in which top-k queries are mapped (translated) into equivalent range queries that relational database systems (RDBMSs) normally support. This approach combines simplicity with practicality by avoiding the need for modifications to the query engine, or for specialized data structures or indexing techniques to handle top-k queries separately. However, existing methods following this approach fall short of adequately modeling the problem environment and providing consistent results. In this article, the authors propose a cost-based range estimation model for the query-mapping approach. They provide a methodology for trading off relevant query execution cost components and mapping a top-k query into a cost-optimal range query for efficient execution. Their experiments on real-world and synthetic data sets show that the proposed strategy not only avoids the need to calibrate workloads on specific database contents, but also performs at least as well as prior methods.
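The query-mapping idea described above can be illustrated with a minimal sketch. It is not the authors' cost-based model: the range is chosen by a naive doubling heuristic, which is an assumption made purely for illustration, and the function name `top_k_by_range`, the `cars` table, and the `price` column are all hypothetical. SQLite stands in for a generic RDBMS.

```python
import sqlite3


def top_k_by_range(conn, table, column, target, k, initial_radius=1.0):
    """Return the k rows whose `column` value is closest to `target`,
    issuing only ordinary range (BETWEEN) queries.

    `table` and `column` are trusted identifiers; `initial_radius` must be > 0.
    """
    # COUNT(column) ignores NULLs, so k never exceeds the reachable rows.
    reachable = conn.execute(f"SELECT COUNT({column}) FROM {table}").fetchone()[0]
    k = min(k, reachable)
    radius = initial_radius
    while True:
        rows = conn.execute(
            f"SELECT rowid, {column} FROM {table} WHERE {column} BETWEEN ? AND ?",
            (target - radius, target + radius),
        ).fetchall()
        if len(rows) >= k:
            # Every row outside the range is farther away than every row inside,
            # so the true top-k are among the rows just fetched.
            rows.sort(key=lambda row: abs(row[1] - target))
            return rows[:k]
        radius *= 2  # too few rows: widen the range and restart (assumed heuristic)


# Hypothetical demo data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (price INTEGER)")
conn.executemany(
    "INSERT INTO cars (price) VALUES (?)",
    [(9000,), (12500,), (14500,), (15200,), (18000,), (30000,)],
)
closest = top_k_by_range(conn, "cars", "price", target=15000, k=3, initial_radius=100)
print([price for _, price in closest])
```

The sketch makes the trade-off visible: a range that is too narrow forces restarts, while a range that is too wide retrieves and sorts more rows than needed. A cost-based estimate, as proposed in the article, would aim to pick the range up front rather than by repeated doubling.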

Author(s):  
Anteneh Ayanso ◽  
Paulo B. Goes ◽  
Kumar Mehta

Relational databases have increasingly become the basis for a wide range of applications that require efficient methods for exploratory search and retrieval. Top-k retrieval addresses this need and involves finding a limited number of records whose attribute values are the closest to those specified in a query. One of the approaches in the recent literature is query-mapping, which converts top-k queries into equivalent range queries that relational database management systems (RDBMSs) normally support. This approach combines simplicity with practicality by avoiding the need for modifications to the query engine, or for specialized data structures and indexing techniques to handle top-k queries separately. This paper reviews existing query-mapping techniques in the literature and presents a range query estimation method based on cost modeling. Experiments on real-world and synthetic data sets show that the cost-based range estimation method performs at least as well as prior methods and avoids the need to calibrate workloads on specific database contents.


Author(s):  
Kiryoong Kim ◽  
Dongkyu Kim ◽  
Jeuk Kim ◽  
Sang-uk Park ◽  
Ighoon Lee ◽  
...  

Electronic catalogs are electronic representations of products and services in the electronic commerce environment and require diverse and flexible schemas. Although relational database systems seem to be an obvious choice for their storage, traditional designs of relational schemas do not support electronic catalogs in the most effective ways. Therefore, new models for managing diverse and flexible schemas in relational databases are required for such systems. This paper proposes several models for electronic catalogs using relational tables and presents an experimental evaluation of their efficiency. The results of this study can be put to practical use and are, in fact, being applied in the design of a commercial software product.


2018 ◽  
Vol 8 ◽  
pp. 263-269
Author(s):  
Grzegorz Dziewit ◽  
Jakub Korczyński ◽  
Maria Skublewska-Paszkowska

Comparing efficiency is not trivial because of the disparities between database systems. This paper presents a methodology for comparing relational database systems with respect to the mean execution time of individual DML queries containing subqueries and table joins. The methodology can also be adapted to study efficiency within a single database system (queries executed directly in the database engine). It makes it possible to state which database system performs better than another, depending on the functionality required by an external application. The article analyzes the mean execution time of individual DML queries. Two research hypotheses are put forward: "Microsoft SQL Server database system needs less time to execute INSERT and UPDATE queries than Oracle database" and "Oracle database system needs less time to execute DML queries with binary data than SQL Server".
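As a rough illustration of measuring the mean execution time of individual DML queries, the sketch below times repeated INSERT and UPDATE statements. It is not the paper's methodology or test harness: SQLite is used only as a stand-in for the systems named in the abstract, and the helper `mean_execution_time` and the `items` table are hypothetical.

```python
import sqlite3
import statistics
import time


def mean_execution_time(conn, sql, params_list):
    """Run `sql` once per parameter tuple and return the mean time in seconds."""
    timings = []
    for params in params_list:
        start = time.perf_counter()
        conn.execute(sql, params)
        conn.commit()  # include the commit so each statement is fully applied
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# Hypothetical demo table; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
insert_mean = mean_execution_time(
    conn,
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(100)],
)
update_mean = mean_execution_time(
    conn,
    "UPDATE items SET name = ? WHERE id = ?",
    [(f"renamed-{i}", i) for i in range(100)],
)
print(f"INSERT mean: {insert_mean:.6f} s, UPDATE mean: {update_mean:.6f} s")
```

Running the same statements through each system's own driver, under the same workload, would give comparable means; the absolute numbers from this sketch say nothing about SQL Server or Oracle.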


Author(s):  
Gábor Szárnyas ◽  
János Maginecz ◽  
Dániel Varró

The last decade brought considerable improvements in distributed storage and query technologies, known as NoSQL systems. These systems provide quick evaluation of simple retrieval operations and are able to answer certain complex queries in a scalable way, albeit not instantly. Providing scalability and quick response times at the same time for querying large data sets is still a challenging task. Evaluating complex graph queries is particularly difficult, as it requires many join, antijoin and filtering operations. This paper presents optimization techniques used in relational database systems and applies them to graph queries. We evaluate various query plans on multiple datasets and discuss the effect of different optimization techniques.


Author(s):  
Andreas Meier ◽  
Günter Schindler ◽  
Nicolas Werro

In practice, information systems are based on very large data collections mostly stored in relational databases. As a result of information overload, it has become increasingly difficult to analyze huge amounts of data and to generate appropriate management decisions. Furthermore, data are often imprecise because they do not accurately represent the world or because they are themselves imperfect. For these reasons, a context model with fuzzy classes is proposed to extend relational database systems. More precisely, fuzzy classes and linguistic variables and terms, together with appropriate membership functions, are added to the database schema. The fuzzy classification query language (fCQL) allows the user to formulate unsharp queries that are then transformed into appropriate SQL statements using the fCQL toolkit so that no migration of the raw data is needed. In addition to the context model with fuzzy classes, fCQL and its implementation are presented here, illustrated by concrete examples.
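To make the notion of fuzzy classes and membership functions concrete, here is a minimal sketch. It does not use fCQL syntax or the fCQL toolkit: the linguistic variables `turnover` and `loyalty`, their value ranges, the linear membership functions, and the use of the minimum as the conjunction operator are all assumptions chosen for illustration.

```python
def ramp_up(x, low, high):
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def ramp_down(x, low, high):
    """Complementary membership falling from 1 at `low` to 0 at `high`."""
    return 1.0 - ramp_up(x, low, high)


# Hypothetical linguistic variables, each with two terms and their
# membership functions over an assumed value range.
TERMS = {
    "turnover": {
        "low": lambda x: ramp_down(x, 0, 1000),
        "high": lambda x: ramp_up(x, 0, 1000),
    },
    "loyalty": {
        "low": lambda x: ramp_down(x, 0, 100),
        "high": lambda x: ramp_up(x, 0, 100),
    },
}


def class_membership(record, fuzzy_class):
    """Degree to which `record` belongs to a fuzzy class such as
    {"turnover": "high", "loyalty": "high"}; min is the assumed conjunction."""
    return min(TERMS[var][term](record[var]) for var, term in fuzzy_class.items())


customer = {"turnover": 750, "loyalty": 40}
degree = class_membership(customer, {"turnover": "high", "loyalty": "high"})
print(round(degree, 2))
```

Unlike a sharp classification, the same record also belongs to neighbouring classes to some degree, for example the class with high turnover and low loyalty, which is what lets an unsharp query rank records instead of merely filtering them.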


Author(s):  
Yangjun Chen

It is a general opinion that relational database systems are inadequate for manipulating composite objects that arise in novel applications such as Web and document databases (Abiteboul, Cluet, Christophides, Milo, Moerkotte & Simon, 1997; Chen & Aberer, 1998, 1999; Mendelzon, Mihaila & Milo, 1997; Zhang, Naughton, Dewitt, Luo & Lohman, 2001), CAD/CAM, CASE, office systems and software management. In particular, when recursive relationships are involved, it is cumbersome to handle them in relational databases, which sets current relational systems far behind the navigational ones (Kuno & Rundensteiner, 1998; Lee & Lee, 1998). To mitigate this difficulty to some extent, many interesting graph encoding methods have been developed. In this article, we give a brief description of some important methods, including analysis and comparison of their space and time complexities.


2015 ◽  
Vol 6 (4) ◽  
pp. 1-19 ◽  
Author(s):  
Negin Keivani ◽  
Abdelsalam M. Maatuk ◽  
Shadi Aljawarneh ◽  
Muhammad Akhtar Ali

Object-relational technology provides a significant increase in scalability and flexibility over traditional relational databases. The additional object-relational features are particularly well suited to advanced database applications with which relational database systems have had difficulties. The key factor in the success of object-relational database systems is their performance. This paper aims to review the promises of object-relational database systems, examine the reality, and consider how their promises may be fulfilled through unification with relational technology. To investigate the performance implications of using object-relational relative to relational technology, the query-oriented BUCKY benchmark was previously applied to an early object-relational database system, Illustra 97. This paper presents the results obtained from implementing and running the BUCKY benchmark on Oracle 10g and compares them with the results of the original BUCKY benchmark. This study sheds light on the functionality of object-relational databases: object-relational technology has made improvements, but some limitations are identified as well. In general, the performance of relational systems surpasses that of object-relational database systems.


Author(s):  
Mary Ann Malloy ◽  
Irena Mlynkova

As XML technologies have become a standard for data representation, it has become necessary to propose and implement efficient techniques for managing XML data. A natural alternative is to exploit tools and functions offered by relational database systems. Unfortunately, this approach has many detractors, especially due to inefficiency caused by structural differences between XML data and relations. On the other hand, relational databases represent a mature, verified and reliable technology for managing any kind of data, including XML documents. In this chapter, the authors provide an overview and classification of existing approaches to XML data management in relational databases. They view the problem from both state-of-the-practice and state-of-the-art perspectives. The authors describe the current best known solutions, their advantages and disadvantages. Finally, they discuss some open issues and their possible solutions.


Author(s):  
Ullas Nambiar

A query against incomplete or imprecise data in a database, or a query whose search conditions are imprecise, can both result in answers that do not satisfy the query completely. Such queries can be broadly termed imprecise queries. Today’s database systems are designed largely for precise queries against a database of precise and complete data. Range queries (e.g., Age BETWEEN 20 AND 30) and disjunctive queries (e.g., Name=“G. W. Bush” OR Name=“George Bush”) do allow for some imprecision in queries. However, these extensions to precise queries are unable to completely capture the expressiveness of an imprecise query. Supporting imprecise queries (e.g., Model like “Camry” and Price around “$15000”) over databases necessitates a system that integrates a similarity search paradigm over structured and semi-structured data. Today’s relational database systems, as they are designed to support precise queries against precise data, use such precise access support mechanisms as indexing, hashing, and sorting. Such mechanisms are used for fast selective searches of records within a table and for joining two tables based on precise matching of values in join fields in the tables. The imprecise nature of the search conditions in queries will make such access mechanisms largely useless. Thus, supporting imprecise queries over existing databases would require adding support for imprecision within the query engine and meta-data management schemes like indexes. Extending a database to support imprecise queries would involve changing the query processing and data storage models being used by the database. But the fact that databases are generally used by other applications and therefore must retain their behaviour could become a key inhibitor to any technique that relies on modifying the database to enable support for imprecision.
For example, changing an airline reservation database will necessitate changes to other connected systems including travel agency databases, partner airline databases etc. Even if the database is modifiable, we would still require a domain expert and/or end user to provide the necessary distance metrics and domain ontology. Domain ontologies do not exist for all possible domains and the ones that are available are far from being complete. Therefore, a feasible solution for answering imprecise queries should neither assume the ability to modify the properties of the database nor require users (both lay and expert) to provide much domain specific information.


Author(s):  
Karthikeyan Ramasamy ◽  
Prasad M. Deshpande

About three decades ago, when Codd (1970) invented the relational database model, it took the database world by storm. The enterprises that adopted it early won a large competitive edge. The past two decades have witnessed tremendous growth of relational database systems, and today the relational model is by far the dominant data model and is the foundation for leading DBMS products, including IBM DB2, Informix, Oracle, Sybase, and Microsoft SQL Server. Relational databases have become a multibillion-dollar industry.

