Aggregate Query Rewriting in Multidimensional Databases

Author(s):  
Leonardo Tininini

An efficient query engine is certainly one of the most important components in data warehouses (also known as OLAP systems or multidimensional databases), and its efficiency is influenced by many other aspects, both logical (data model, policy of view materialization, etc.) and physical (multidimensional or relational storage, indexes, etc.). OLAP queries are typically based on the metaphor of the data cube and on the concepts of facts, measures, and dimensions and, in contrast to conventional transactional workloads, they require the classification and aggregation of enormous quantities of data. Nevertheless, one of the fundamental requirements for these systems is the ability to perform multidimensional analyses in online response times. Since evaluating a typical OLAP aggregate query from scratch may require several hours of computation, this can only be achieved by pre-computing several queries, storing the answers permanently in the database, and reusing them in the query evaluation process. These pre-computed queries are commonly referred to as materialized views, and the problem of evaluating a query by using (possibly only) these pre-computed results is known as the problem of answering/rewriting queries using views. In this paper we briefly analyze the difference between the query answering and query rewriting approaches and explain why query rewriting is preferable in a data warehouse context. We also discuss the main techniques proposed in the literature to rewrite aggregate multidimensional queries using materialized views.
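
To make the rewriting idea concrete, here is a minimal Python sketch (hypothetical data and names throughout) of the core mechanism: a query grouping at a coarse level is answered by re-aggregating a finer-grained materialized view instead of rescanning the fact table, which is valid for distributive aggregates such as SUM and COUNT.

```python
from collections import defaultdict

# Hypothetical fact table: (year, month, store, sales_amount) tuples.
facts = [
    (2002, 1, "rome", 100.0), (2002, 1, "milan", 80.0),
    (2002, 2, "rome", 120.0), (2003, 1, "rome", 90.0),
]

# Materialized view: SUM(sales) pre-aggregated by (year, month).
view = defaultdict(float)
for year, month, _store, amount in facts:
    view[(year, month)] += amount

# Rewriting: a query asking for SUM(sales) grouped by year never touches
# the fact table; SUM is distributive, so the finer-grained view can
# simply be re-aggregated ("rolled up") to the coarser level.
by_year = defaultdict(float)
for (year, _month), subtotal in view.items():
    by_year[year] += subtotal

print(dict(by_year))  # {2002: 300.0, 2003: 90.0}
```

The same roll-up step fails for non-distributive aggregates such as AVG unless the view also stores the underlying counts, which is one of the central complications that aggregate rewriting techniques must handle.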

2003
pp. 252-281
Author(s):  
Leonardo Tininini

A powerful and easy-to-use querying environment is certainly one of the most important components in a multidimensional database, and its effectiveness is influenced by many other aspects, both logical (data model, integration, policy of view materialization, etc.) and physical (multidimensional or relational storage, indexes, etc.). Multidimensional querying is typically based on the metaphor of the data cube and on the concepts of facts, measures, and dimensions. In contrast to conventional transactional environments, multidimensional querying is often an exploratory process, performed by navigating along the dimensions and measures, increasing/decreasing the level of detail and focusing on specific subparts of the cube that appear to be “promising” for the required information. In this chapter we focus on the main languages proposed in the literature to express multidimensional queries, particularly those based on: (i) an algebraic approach, (ii) a declarative paradigm (calculus), and (iii) visual constructs and syntax. We analyze the problem of evaluation, i.e., the issues related to efficient data retrieval and calculation, possibly (often necessarily) using some pre-computed data, a problem known in the literature as rewriting a query using views. We also illustrate the use of particular index structures to speed up the query evaluation process.
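
As a rough illustration of the navigational operations these languages formalize, the following Python sketch (toy data, hypothetical helper names) implements two of them over a cube stored as a sparse dictionary: slice, which fixes one dimension to a single member, and roll-up, which aggregates a dimension away.

```python
# A toy cube: cells keyed by (product, city, year) -> sales measure.
cube = {
    ("tv", "rome", 2002): 5, ("tv", "milan", 2002): 3,
    ("radio", "rome", 2002): 2, ("tv", "rome", 2003): 4,
}

def slice_(cube, axis, value):
    """Fix one dimension to a single member (OLAP 'slice')."""
    return {k: v for k, v in cube.items() if k[axis] == value}

def roll_up(cube, axis):
    """Aggregate away one dimension (OLAP 'roll-up' with SUM)."""
    out = {}
    for key, measure in cube.items():
        coarser = key[:axis] + key[axis + 1:]
        out[coarser] = out.get(coarser, 0) + measure
    return out

print(slice_(cube, 2, 2002))  # only year-2002 cells
print(roll_up(cube, 1))       # sales by (product, year)
```

Drill-down is the inverse of roll-up and cannot be computed from the rolled-up cube alone; it needs the finer-grained data, which is one reason the navigation operations and the materialization policy interact.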


2012
Author(s):
Guy E. Hawkins
Birte U. Forstmann
Eric-Jan Wagenmakers
Scott D. Brown

Author(s):  
Geir Mjøen
Umberto Maggiore
Nicos Kessaris
Diederik Kimenai
Bruno Watschinger
...  

Abstract
Background: Publications from the last decade have increased knowledge regarding long-term risks after kidney donation. We performed a survey to assess how transplant professionals in Europe inform potential kidney donors regarding long-term risks. The objectives of the survey were to determine how donors are informed and to what extent, and to evaluate the degree of variation.
Methods: All transplant professionals involved in the evaluation process were considered eligible, regardless of the type of profession. The survey was distributed as a link to a web-based questionnaire. It included questions on demographics, the respondent's information policy and the use of risk calculators, including the difference between relative and absolute risks and how the respondents themselves understood these risks.
Results: The main finding was a large variation in how often different long-term risks were discussed with potential donors, i.e. from always to never. Eighty percent of respondents stated that they always discuss the risk of end-stage renal disease, while 56% stated that they always discuss the risk of preeclampsia. Twenty percent of respondents answered correctly regarding the relationship between absolute and relative risks for rare outcomes.
Conclusions: The use of written information and checklists should be encouraged. This may improve standardization regarding the information provided to potential living kidney donors in Europe. There is a need for information and education among European transplant professionals regarding long-term risks after kidney donation and how to interpret and present these risks.
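
The distinction the survey probed can be illustrated with a short calculation; the numbers below are purely hypothetical and chosen only to show that, for a rare outcome, a large relative risk can still correspond to a small absolute risk.

```python
# Hypothetical numbers, for illustration only.
baseline_risk = 0.005   # assumed lifetime risk in non-donors: 0.5%
relative_risk = 3.0     # assumed relative risk after donation

absolute_risk = baseline_risk * relative_risk      # 1.5%
absolute_increase = absolute_risk - baseline_risk  # 1.0 percentage point

print(f"Relative risk: {relative_risk:.1f}x")          # 3.0x
print(f"Absolute risk: {absolute_risk:.1%}")           # 1.5%
print(f"Absolute increase: {absolute_increase:.1%}")   # 1.0%
```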


Author(s):  
Romain Perriot
Laurent d’Orazio
Dominique Laurent
Nicolas Spyratos

Author(s):  
Leonardo Tininini

This paper reviews the main techniques for the efficient calculation of aggregate multidimensional views and data cubes, possibly using specifically designed indexing structures. The efficient evaluation of aggregate multidimensional queries is obviously one of the most important aspects of data warehouses (OLAP systems). In particular, a fundamental requirement of such systems is the ability to perform multidimensional analyses in online response times. As multidimensional queries usually involve a huge amount of data to be aggregated, the only way to achieve this is to pre-compute some queries, store the answers permanently in the database, and reuse them almost exclusively when evaluating queries over the multidimensional database. These pre-computed queries are commonly referred to as materialized views, and they raise several related issues: how to compute them efficiently (the focus of this paper), but also which views to materialize and how to maintain them.
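
As a point of reference for such techniques, the following Python sketch (hypothetical schema and data) shows the naive way to compute a full data cube: one independent GROUP BY per subset of dimensions. Efficient cube algorithms in the literature improve on this baseline mainly by sharing sorts and scans across cuboids and by deriving each cuboid from its smallest already-computed parent.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: dimension values plus one measure.
DIMS = ("product", "city", "year")
facts = [
    ("tv", "rome", 2002, 100.0), ("tv", "milan", 2002, 80.0),
    ("radio", "rome", 2003, 40.0),
]

# Naive CUBE computation: one full scan per subset of dimensions,
# i.e. 2^d group-bys for d dimensions.
cube = {}
for k in range(len(DIMS) + 1):
    for subset in combinations(range(len(DIMS)), k):
        cuboid = defaultdict(float)
        for row in facts:
            key = tuple(row[i] for i in subset)
            cuboid[key] += row[-1]  # SUM of the measure
        cube[tuple(DIMS[i] for i in subset)] = dict(cuboid)

print(cube[("year",)])  # {(2002,): 180.0, (2003,): 40.0}
```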


1984
Vol 54 (1)
pp. 83-90
Author(s):  
R. Kim Guenther

English-Persian bilingual subjects decided whether probes were true or false for some previously studied discourses written in either English or Persian. The response times to probes depicting explicit events were much faster when the probes were written in the same language as their discourses than when they were written in the other language. However, for probes depicting implicit or false events, the difference between response times to probes written in the same language and response times to probes written in the other language was substantially less. The results were consistent with the hypothesis that the form of the stored representation of the meaning of discourse is the same for both languages of the bilingual person.


2012
Vol 2012
pp. 1-16
Author(s):  
Dirk Borghys
Ingebjørg Kåsen
Véronique Achard
Christiaan Perneel

Anomaly detection (AD) in hyperspectral data has received a lot of attention for various applications. The aim of anomaly detection is to detect pixels in the hyperspectral data cube whose spectra differ significantly from the background spectra. Many anomaly detectors have been proposed in the literature. They differ in the way the background is characterized and in the method used for determining the difference between the current pixel and the background. The best-known anomaly detector is the RX detector, which calculates the Mahalanobis distance between the pixel under test (PUT) and the background. Global RX characterizes the background of the complete scene by a single multivariate normal probability density function. In many cases this model is not appropriate for describing the background, and for that reason a variety of other anomaly detection methods have been developed. This paper examines three classes of anomaly detectors: subspace methods, local methods, and segmentation-based methods. Representative examples of each class are chosen and applied to a set of hyperspectral data of diverse complexity. The results are evaluated and compared.
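
As a reference point, here is a minimal sketch of the global RX detector described above, assuming the cube is a NumPy array of shape (rows, cols, bands); the synthetic scene and the planted anomaly are of course only for illustration.

```python
import numpy as np

def global_rx(cube):
    """Global RX scores for a hyperspectral cube of shape (rows, cols, bands).

    Each pixel's score is its squared Mahalanobis distance to the
    scene-wide background model (sample mean and covariance over all pixels).
    """
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(float)
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse guards against singularity
    centered = pixels - mu
    # Mahalanobis distance (squared) for every pixel at once.
    scores = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return scores.reshape(rows, cols)

# Synthetic test: flat Gaussian background plus one bright anomalous pixel.
rng = np.random.default_rng(0)
cube = rng.normal(size=(32, 32, 10))
cube[5, 7] += 8.0
scores = global_rx(cube)
print(np.unravel_index(scores.argmax(), scores.shape))  # (5, 7)
```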


2014
Vol 2014
pp. 1-10
Author(s):  
Neal M. Bengtson

The technique of operational analysis (OA) is used in the study of systems performance, mainly for estimating mean values of various measures of interest, such as the number of jobs at a device and response times. The basic principles of operational analysis allow errors in assumptions to be quantified over a time period. The assumptions used to derive the operational analysis relationships are studied. Using Karush-Kuhn-Tucker (KKT) conditions, bounds on the error measures of these OA relationships are found. Examples of these bounds are given for representative performance measures to show the limits on the difference between true performance values and those estimated by operational analysis relationships. A technique for finding tolerance limits on the bounds is demonstrated with a simulation example.
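
For readers unfamiliar with OA, the sketch below (hypothetical measured values) shows the kind of operational relationships whose error bounds are at issue, e.g. the utilization law U = XS and Little's law N = XR; these hold exactly only under assumptions such as job flow balance, which is precisely the kind of assumption error the paper quantifies.

```python
# Basic operational laws applied to hypothetical measured quantities.
T = 60.0   # observation period (s)
A = 120    # arrivals observed
C = 118    # completions observed (near, but not exact, job flow balance)
B = 45.0   # time the device was busy (s)

throughput = C / T    # X = C / T  (throughput)
utilization = B / T   # U = B / T  (utilization law)
service_time = B / C  # S = B / C  (mean service time per completion)

# Utilization law as a consistency check: U = X * S.
assert abs(utilization - throughput * service_time) < 1e-12

# Little's law N = X * R relates mean population to mean response time.
mean_jobs = 3.2  # measured mean number of jobs present
response_time = mean_jobs / throughput
print(f"X={throughput:.3f}/s  U={utilization:.2f}  R={response_time:.2f}s")
```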


2020
Author(s):
Farshad Rafiei
Dobromir Rahnev

It is often thought that the diffusion model explains all effects related to the speed-accuracy tradeoff (SAT), but this has previously been examined with only a few SAT conditions or only a few subjects. Here we collected data from 20 subjects who performed a perceptual discrimination task with five difficulty levels and five SAT conditions (5,000 trials/subject). We found that the five SAT conditions produced robustly U-shaped curves for (i) the difference between error and correct response times (RTs), (ii) the ratio of the standard deviation to the mean of the RT distributions, and (iii) the skewness of the RT distributions. Critically, a diffusion model in which only drift rate varies with contrast and only boundary varies with SAT could not account for any of the three U-shaped curves. Further, allowing all parameters to vary across conditions revealed that both the SAT and difficulty manipulations resulted in substantial modulations of every model parameter, while still providing imperfect fits to the data. These findings demonstrate that the diffusion model cannot fully explain the effects of SAT and establish three robust but challenging effects that models of SAT should account for.
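
For concreteness, here is a minimal simulation of the standard account being tested: only the boundary parameter changes between speed and accuracy regimes. This is an illustrative sketch (simple Euler scheme, hypothetical parameter values), not the fitting procedure used in the study.

```python
import numpy as np

def simulate_ddm(drift, boundary, n_trials=500, dt=1e-3, noise=1.0,
                 non_decision=0.3, seed=0):
    """Simulate first-passage times of a basic drift-diffusion model.

    Evidence starts midway between boundaries 0 and `boundary` and
    accumulates with the given drift plus Gaussian noise until it
    crosses either boundary. Returns response times and accuracies.
    """
    rng = np.random.default_rng(seed)
    rts, correct = [], []
    for _ in range(n_trials):
        x, t = boundary / 2.0, 0.0
        while 0.0 < x < boundary:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts.append(t + non_decision)
        correct.append(x >= boundary)
    return np.array(rts), np.array(correct)

# One difficulty level (fixed drift) under a "speed" vs an "accuracy"
# regime: in the standard account only the boundary changes with SAT.
for label, b in [("speed", 0.6), ("accuracy", 1.6)]:
    rts, correct = simulate_ddm(drift=1.0, boundary=b)
    print(f"{label}: mean RT={rts.mean():.3f}s  accuracy={correct.mean():.2f}")
```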

