Multidimensional Databases
Latest Publications


TOTAL DOCUMENTS

13
(FIVE YEARS 0)

H-INDEX

5
(FIVE YEARS 0)

Published By IGI Global

9781591400530, 9781591400868

2003 ◽  
pp. 282-309 ◽  
Author(s):  
Curtis E. Dyreson ◽  
Torben Bach Pedersen ◽  
Christian S. Jensen

While incomplete information is endemic to real-world data, current multidimensional data models are not engineered to manage incomplete information in base data, derived data, and dimensions. This chapter presents several strategies for managing incomplete information in multidimensional databases. The choice of strategy depends on the kind of incomplete information present and on where it occurs in the multidimensional database. A relatively simple strategy is to replace incomplete information with appropriate, complete information. The advantage of this strategy is that all multidimensional databases can manage complete information. Other strategies require more substantial changes to the multidimensional database. One strategy is to reflect the incompleteness in computed aggregates, which is possible only if the multidimensional database allows incomplete values in its hierarchies. Another strategy is to measure the amount of incompleteness in aggregated values by tallying how much uncertain information went into their production.
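The last strategy, tallying how much incomplete input went into an aggregate, can be illustrated with a minimal Python sketch. It is not the chapter's method: the function name `sum_with_incompleteness`, the use of `None` for a missing measure, and the sample data are all assumptions made for illustration.

```python
from typing import Optional


def sum_with_incompleteness(values: list[Optional[float]]) -> tuple[float, float]:
    """Return (sum of the known values, fraction of inputs that were missing).

    Assumption: a missing measure is represented as None.
    """
    known = [v for v in values if v is not None]
    missing = len(values) - len(known)
    fraction_missing = missing / len(values) if values else 0.0
    return float(sum(known)), fraction_missing


# Hypothetical base data: two of the four sales figures are unknown.
sales = [120.0, None, 80.0, None]
total, fraction = sum_with_incompleteness(sales)
print(total, fraction)  # 200.0 0.5
```

Reporting the fraction alongside the total lets a consumer of the aggregate judge how far to trust it, rather than silently treating missing values as zero.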


2003 ◽  
pp. 252-281
Author(s):  
Leonardo Tininini

A powerful and easy-to-use querying environment is certainly one of the most important components in a multidimensional database, and its effectiveness is influenced by many other aspects, both logical (data model, integration, policy of view materialization, etc.) and physical (multidimensional or relational storage, indexes, etc.). As is evident, multidimensional querying is often based on the metaphor of the data cube and on the concepts of facts, measures, and dimensions. In contrast to conventional transactional environments, multidimensional querying is often an exploratory process, performed by navigating along the dimensions and measures, increasing or decreasing the level of detail, and focusing on specific subparts of the cube that appear to be “promising” for the required information. In this chapter we focus on the main languages proposed in the literature to express multidimensional queries, particularly those based on: (i) an algebraic approach, (ii) a declarative paradigm (calculus), and (iii) visual constructs and syntax. We analyze the problem of evaluation, i.e., the issues related to efficient data retrieval and calculation, possibly (often necessarily) using some pre-computed data, a problem known in the literature as the problem of rewriting a query using views. We also illustrate the use of particular index structures to speed up the query evaluation process.
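The idea of rewriting a query using views can be sketched in Python as follows. This is only an illustration under stated assumptions: the view contents, the city and month values, and the function name `sales_by_city` are hypothetical, and the sketch relies on SUM being distributive, so a coarser total can be derived from finer pre-computed totals without touching the base data.

```python
from collections import defaultdict

# Hypothetical pre-computed view: total sales per (city, month).
view_city_month = {
    ("Rome", "2003-01"): 10,
    ("Rome", "2003-02"): 15,
    ("Oslo", "2003-01"): 7,
}


def sales_by_city(view: dict[tuple[str, str], int]) -> dict[str, int]:
    """Answer 'total sales per city' from the finer-grained view only."""
    result: dict[str, int] = defaultdict(int)
    for (city, _month), total in view.items():
        result[city] += total
    return dict(result)


print(sales_by_city(view_city_month))  # {'Rome': 25, 'Oslo': 7}
```

The same rewriting would not be valid for a non-distributive aggregate such as a median, which is one reason the choice of pre-computed data matters.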


2003 ◽  
pp. 91-115 ◽  
Author(s):  
Elaheh Pourabbas ◽  
Maurizio Rafanelli

In this chapter we will focus on the rules of aggregation hierarchies in analysis dimensions of a cube. We give an overview of the related works on the basic concepts of the different types of aggregation hierarchies. We then discuss the hierarchies from two different points of view: mapping between domain values and hierarchical structures. In relation to them, we introduce the characterization of some OLAP operators on hierarchies and give a set of operators that concern the change in the hierarchy structure. Finally, we propose an enlargement of the operator set concerning hierarchies.


2003 ◽  
pp. 1-45
Author(s):  
Maurizio Rafanelli

This chapter presents the basic notions regarding multidimensional (aggregate) databases by referring to different definitions given for them in the literature. It illustrates the important concepts of micro, macro, and metadata; presents a formal definition of the aggregation process, discussing the concepts of dimension and dimension hierarchies; describes the multidimensional aggregate data structure, distinguishing between simple, complex, and composite structures; illustrates the different types of null values; and discusses the differences and similarities between multidimensional aggregate data (generally called statistical data because they are used mainly by statisticians) and the On-Line Analytical Processing (OLAP) of multidimensional data represented by different data cubes, including the different (symmetric and non-symmetric) treatment of dimensions and measures required by OLAP and aggregate multidimensional databases. Finally, it discusses a graph model and a tabular model for this kind of data, and gives a set of definitions regarding OLAP terminology.


2003 ◽  
pp. 361-392 ◽  
Author(s):  
Andrea Cali ◽  
Domenico Lembo ◽  
Maurizio Lenzerini ◽  
Riccardo Rosati

While the main goal of a data warehouse is to support data analysis and management decisions, a fundamental aspect of designing a data warehouse system is the process of acquiring the raw data from a set of relevant information sources. We call the component of a data warehouse system that deals with this process the source integration system. Its main goal is to transfer data from the set of sources constituting the application-oriented operational environment to the data warehouse. Since sources are typically autonomous, distributed, and heterogeneous, this task involves cleaning, reconciling, and integrating the data coming from the sources. The design of a source integration system is a very complex task that comprises several different issues. The purpose of this chapter is to discuss the most important problems arising in the design of a source integration system, with special emphasis on schema integration, processing queries for data integration, and data cleaning and reconciliation.


2003 ◽  
pp. 46-68 ◽  
Author(s):  
Arie Shoshani

The term “multidimensional databases” refers to data that can be viewed conceptually in a multidimensional space, where each dimension represents some attributes of the data. Viewing data in this form is natural for many applications, yet the concepts are not treated in a uniform way in the database literature. In this chapter, we show the commonality of concepts between three database areas: statistical, OLAP, and scientific databases. We show that these domains have two main structural concepts: the cross-product space of the dimensions, and the classification hierarchy structure associated with each dimension. In the first part of this chapter we describe how these structures are used to represent data in statistical and OLAP databases and how summarization operators can be applied to them. Further, we discuss how these structures can be extended to represent related information using federated database concepts. In the second part of the chapter we show that these concepts are common to many scientific database applications. In particular, we discuss the importance of supporting classification structures and the difficulty in representing them as tables in relational databases. We also discuss data structures to support multidimensional databases, emphasizing space-time representation, clustering in multidimensional space, indexing in multidimensional space, and supporting classification structures. We conclude by arguing that the concepts of multidimensionality and classification structures, as well as the operations over them, should be elevated to “first class” object types. These object types should be explicitly visible to the application user in the conceptual schemas and exposed in the user interfaces.
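The two structural concepts named above, the cross-product space of the dimensions and a classification hierarchy on a dimension, can be sketched in Python. All names and values here (`cities`, `years`, `city_to_country`, `roll_up_to_country`) are hypothetical and chosen only for illustration; the chapter does not prescribe this representation.

```python
from collections import defaultdict
from itertools import product

# Two dimensions; their cross product is the space of possible cells.
cities = ["Rome", "Milan", "Oslo"]
years = [2002, 2003]
cell_space = list(product(cities, years))  # 3 * 2 = 6 possible cells

# Classification hierarchy on the city dimension: city -> country.
city_to_country = {"Rome": "Italy", "Milan": "Italy", "Oslo": "Norway"}

# Sparse cube: only the cells that actually hold a measure are stored.
cells = {("Rome", 2002): 5, ("Milan", 2002): 3, ("Oslo", 2003): 4}


def roll_up_to_country(cells, hierarchy):
    """Summarize the measure from city level to country level."""
    out = defaultdict(int)
    for (city, year), value in cells.items():
        out[(hierarchy[city], year)] += value
    return dict(out)


print(len(cell_space))  # 6
print(roll_up_to_country(cells, city_to_country))
# {('Italy', 2002): 8, ('Norway', 2003): 4}
```

Keeping the hierarchy as its own mapping, separate from the cells, mirrors the argument that classification structures deserve to be explicit objects rather than being buried in flat tables.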


2003 ◽  
pp. 222-251 ◽  
Author(s):  
Stefano Paraboschi ◽  
Giuseppe Sindoni ◽  
Elena Baralis ◽  
Ernest Teniente

This chapter presents materialized views in the context of multidimensional databases (MDDBs). A materialized view is a view whose content is explicitly stored in the database. The advantage of materializing views is that it is not necessary to recompute the query every time the view is accessed. The shortcoming is that the view has to be kept consistent with updates to the base tables. However, efficient incremental maintenance techniques have been proposed. MDDBs are an ideal environment for materialized views because the frequency of updates is low, MDDB data models permit easy adoption of incremental maintenance, and queries can be modeled in such a way as to allow an easy definition of the view selection problem, i.e., the problem of selecting which queries to materialize in an MDDB. Hence, we present the problems of choosing and maintaining materialized views, together with the corresponding solutions.
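Incremental maintenance can be illustrated with a small Python sketch of a materialized SUM view that is adjusted by each base-table change instead of being recomputed. The class name `MaterializedSumView`, its methods, and the sample rows are assumptions for illustration and do not reproduce any specific technique from the chapter.

```python
class MaterializedSumView:
    """Stored result of: SELECT key, SUM(amount) FROM base GROUP BY key."""

    def __init__(self, base_rows):
        # Initial materialization: one full pass over the base rows.
        self.totals = {}
        for key, amount in base_rows:
            self.apply_insert(key, amount)

    def apply_insert(self, key, amount):
        """Propagate an inserted base row by adding its delta to one group."""
        self.totals[key] = self.totals.get(key, 0) + amount

    def apply_delete(self, key, amount):
        """Propagate a deleted base row by subtracting its delta."""
        self.totals[key] -= amount


view = MaterializedSumView([("A", 10), ("B", 5)])
view.apply_insert("A", 2)   # only group "A" is touched
view.apply_delete("B", 5)
print(view.totals)  # {'A': 12, 'B': 0}
```

Each update costs time proportional to the change rather than to the size of the base table, which is the property that makes materialization attractive when updates are infrequent.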


2003 ◽  
pp. 200-221 ◽  
Author(s):  
Mirek Riedewald ◽  
Divyakant Agrawal ◽  
Amr El Abbadi

Data cubes are ubiquitous tools in data warehousing, online analytical processing, and decision support applications. Based on a selection of pre-computed and materialized aggregate values, they can dramatically speed up aggregation and summarization over large data collections. Traditionally, the emphasis has been on lowering query costs with little regard to maintenance, i.e., update cost issues. We argue that current trends require data cubes to be not only query-efficient, but also dynamic at the same time, and we also show how this can be achieved. Several array-based techniques with different tradeoffs between query and update cost are discussed in detail. We also survey selected approaches for sparse data and the popular data cube operator, CUBE. Moreover, this work includes an overview of future trends and their impact on data cubes.
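The tradeoff between query and update cost can be seen in a one-dimensional Python sketch based on prefix sums, one well-known array-based technique; whether it matches the techniques covered in the chapter is an assumption. The function names and sample array are hypothetical. With prefix sums a range-sum query takes constant time, but a single cell update must touch every later prefix.

```python
from itertools import accumulate


def build_prefix_sums(a):
    """prefix[k] holds the sum of a[0..k-1]; prefix[0] is 0."""
    return [0] + list(accumulate(a))


def range_sum(prefix, lo, hi):
    """Sum of a[lo..hi] inclusive, in O(1) using two lookups."""
    return prefix[hi + 1] - prefix[lo]


def update(a, prefix, i, new_value):
    """Set a[i] = new_value; O(n) because all later prefixes change."""
    delta = new_value - a[i]
    a[i] = new_value
    for j in range(i + 1, len(prefix)):
        prefix[j] += delta


a = [3, 1, 4, 1, 5]
prefix = build_prefix_sums(a)
print(range_sum(prefix, 1, 3))  # 6
update(a, prefix, 2, 10)
print(range_sum(prefix, 1, 3))  # 12
```

Storing only the raw array reverses the costs (constant-time update, linear-time range query), which is why intermediate structures that balance the two are of interest for dynamic data cubes.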


2003 ◽  
pp. 166-199 ◽  
Author(s):  
Alberto O. Mendelzon ◽  
Alejandro A. Vaisman

In spite of the obvious importance of time in data warehousing and OLAP, current commercial systems do not support tracking the history of a data warehouse, either at the schema or instance level. In this chapter we address this issue, introducing the Temporal Multidimensional Model and a query language, denoted TOLAP, that allows temporal OLAP queries to be expressed at a high level of abstraction. Further, we show that previous work in temporal databases needs to be extended in order to handle evolution and versioning in OLAP. Finally, we present an implementation, along with preliminary experimental results.


2003 ◽  
pp. 116-165 ◽  
Author(s):  
Maurizio Rafanelli

In this chapter the author presents different approaches for defining operators that manipulate the multidimensional data structure. In particular, he first considers operators for multidimensional aggregate data which extend relational algebra and relational calculus (the so-called enlarged relational model). Then he discusses operators for multidimensional aggregate data defined in a tabular environment. In both cases the author defines such data as statistical (aggregate) data. Subsequently he introduces the operators for OLAP applications, giving a terminology correspondence between the multidimensional aggregate (statistical) database and OLAP areas. Then he defines the fundamental operators deduced from the previous ones, which form the basic algebra for the manipulation of multidimensional aggregate data, giving their formal definitions and some explanatory examples.

