Dynamic Multidimensional Data Cubes

2003 ◽  
pp. 200-221 ◽  
Author(s):  
Mirek Riedewald ◽  
Divyakant Agrawal ◽  
Amr El Abbadi

Data cubes are ubiquitous tools in data warehousing, online analytical processing, and decision support applications. Based on a selection of pre-computed and materialized aggregate values, they can dramatically speed up aggregation and summarization over large data collections. Traditionally, the emphasis has been on lowering query costs with little regard to maintenance, i.e., update cost issues. We argue that current trends require data cubes to be not only query-efficient, but also dynamic at the same time, and we also show how this can be achieved. Several array-based techniques with different tradeoffs between query and update cost are discussed in detail. We also survey selected approaches for sparse data and the popular data cube operator, CUBE. Moreover, this work includes an overview of future trends and their impact on data cubes.
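The query/update tradeoff mentioned in this abstract can be illustrated with a prefix-sum array, a well-known array-based technique. The sketch below is a minimal illustration under our own assumptions, not necessarily one of the methods covered in the chapter; the function names and the sample `sales` array are hypothetical. A range-sum query costs four lookups, while a single-cell update may touch a large part of the precomputed array.

```python
from typing import List


def build_prefix_sums(cube: List[List[int]]) -> List[List[int]]:
    """Precompute P, where P[i][j] is the sum of cube[0..i-1][0..j-1]."""
    rows, cols = len(cube), len(cube[0])
    # The extra zero row and column remove boundary checks from queries.
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            P[i + 1][j + 1] = cube[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]
    return P


def range_sum(P: List[List[int]], r1: int, c1: int, r2: int, c2: int) -> int:
    """Sum of cube[r1..r2][c1..c2] (inclusive) using four lookups."""
    return P[r2 + 1][c2 + 1] - P[r1][c2 + 1] - P[r2 + 1][c1] + P[r1][c1]


def update(cube: List[List[int]], P: List[List[int]], r: int, c: int, value: int) -> None:
    """Set cube[r][c] = value; every prefix sum covering that cell changes."""
    delta = value - cube[r][c]
    cube[r][c] = value
    for i in range(r + 1, len(P)):
        for j in range(c + 1, len(P[0])):
            P[i][j] += delta


# Hypothetical 3x3 cube of a single measure.
sales = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
P = build_prefix_sums(sales)
print(range_sum(P, 1, 1, 2, 2))  # sum of the lower-right 2x2 block
```

Keeping only the raw array gives the opposite tradeoff: constant-time updates, but queries that must visit every cell in the range.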

2008 ◽  
pp. 1334-1354
Author(s):  
Navin Kumar ◽  
Aryya Gangopadhyay ◽  
George Karabatis ◽  
Sanjay Bapna ◽  
Zhiyuan Chen

Navigating through multidimensional data cubes is a nontrivial task. Although On-Line Analytical Processing (OLAP) provides the capability to view multidimensional data through rollup, drill-down, and slicing-dicing, it offers minimal guidance to end users in the actual knowledge discovery process. In this article, we address this knowledge discovery problem by identifying novel and useful patterns concealed in multidimensional data that are used for effective exploration of data cubes. We present an algorithm for the DIscovery of Sk-NAvigation Rules (DISNAR), which discovers hidden interesting patterns in the form of Sk-navigation rules by applying a test of skewness to pairs consisting of the current lattice node and each of its candidate drill-down nodes. The rules are then used to enhance navigational capabilities, as illustrated by our rule-driven system. Extensive experimental analysis shows that the DISNAR algorithm discovers the interesting patterns with high recall and precision, short execution time, and low space overhead.


Web Mining ◽  
2011 ◽  
pp. 189-207
Author(s):  
Lixin Fu

Currently, data classification is either performed on data stored in relational databases or performed on data stored in flat files. The problem with these approaches is that for large data sets, they often need multiple scans of the original data and thus are often infeasible in many applications. In this chapter we propose to deploy classification on top of OLAP (online analytical processing) and data cube systems. First, we compute the statistics in various combinations of the attributes known as data cubes. The statistics are then used to derive classification models. In this way, we only scan the original data once, which improves the performance of classification significantly. Furthermore, our new classifier will provide “free” classification by eliminating the dominating I/O overhead of scanning the massive original data. An architecture that integrates database, data cube, and data mining is given and three new cube-based classifiers are presented and evaluated.
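The idea of deriving a classifier from pre-aggregated statistics rather than from repeated scans can be sketched as follows. This is a minimal naive Bayes example built on one-pass count aggregates; it is an assumption-laden illustration and not one of the three cube-based classifiers proposed in the chapter. The function names, the add-one smoothing, and the toy rows are all hypothetical.

```python
from collections import Counter, defaultdict


def scan_once(rows):
    """Single pass over the data: collect the count aggregates a cube would hold."""
    class_counts = Counter()      # label -> number of rows
    cell_counts = Counter()       # (attribute, value, label) -> number of rows
    domain = defaultdict(set)     # attribute -> distinct values seen
    for features, label in rows:
        class_counts[label] += 1
        for attr, value in features.items():
            cell_counts[(attr, value, label)] += 1
            domain[attr].add(value)
    return class_counts, cell_counts, domain


def classify(features, class_counts, cell_counts, domain):
    """Naive Bayes prediction computed from the aggregates only."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        score = n / total  # class prior
        for attr, value in features.items():
            # Add-one smoothing (our assumption) avoids zero probabilities.
            score *= (cell_counts[(attr, value, label)] + 1) / (n + len(domain[attr]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data set.
rows = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
]
stats = scan_once(rows)
print(classify({"outlook": "sunny", "windy": "no"}, *stats))
```

Once `scan_once` has run, `classify` never touches the original rows, which is the property the abstract refers to as avoiding the I/O cost of rescanning.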


Author(s):  
E. E. Akimkina

The problems of structuring indicators in multidimensional data cubes, and of subsequently processing them with end-user tools that provide multidimensional visualization and data management, are analyzed. The possibilities of multidimensional data processing technologies for management and decision support at a design and technological enterprise are shown. Practical recommendations are given on the use of domestic computer environments for structuring and visualizing multidimensional data cubes.


2003 ◽  
pp. 252-281
Author(s):  
Leonardo Tininini

A powerful and easy-to-use querying environment is certainly one of the most important components in a multidimensional database, and its effectiveness is influenced by many other aspects, both logical (data model, integration, policy of view materialization, etc.) and physical (multidimensional or relational storage, indexes, etc.). As is evident, multidimensional querying is often based on the metaphor of the data cube and on the concepts of facts, measures, and dimensions. In contrast to conventional transactional environments, multidimensional querying is often an exploratory process, performed by navigating along the dimensions and measures, increasing/decreasing the level of detail and focusing on specific subparts of the cube that appear to be “promising” for the required information. In this chapter we focus on the main languages proposed in the literature to express multidimensional queries, particularly those based on: (i) an algebraic approach, (ii) a declarative paradigm (calculus), and (iii) visual constructs and syntax. We analyze the problem of evaluation, i.e., the issues related to the efficient data retrieval and calculation, possibly (often necessarily) using some pre-computed data, a problem known in the literature as the problem of rewriting a query using views. We also illustrate the use of particular index structures to speed up the query evaluation process.


2008 ◽  
pp. 3176-3193
Author(s):  
Ying Chen ◽  
Frank Dehne ◽  
Todd Eavis ◽  
A. Rau-Chaplin

This paper presents an improved parallel method for generating ROLAP data cubes on a shared-nothing multiprocessor, based on a novel optimized data partitioning technique. Since no shared disk is required, our method can be used for highly scalable processor clusters consisting of standard PCs with local disks only, connected via a data switch. Experiments show that our improved parallel method provides optimal, linear speedup for at least 32 processors. The approach taken, which uses a ROLAP representation of the data cube, is well suited for large data warehouses and high-dimensional data, and supports the generation of both fully materialized and partially materialized data cubes.


2021 ◽  
Vol 73 (4) ◽  
pp. 1036-1047
Author(s):  
Felipe Menino Carlos ◽  
Vitor Conrado Faria Gomes ◽  
Gilberto Ribeiro de Queiroz ◽  
Felipe Carvalho de Souza ◽  
Karine Reis Ferreira ◽  
...  

The potential to perform spatiotemporal analysis of the Earth's surface, fostered by a large amount of Earth Observation (EO) open data provided by space agencies, brings new perspectives to create innovative applications. Nevertheless, these big datasets pose some challenges regarding storage and analytical processing capabilities. The organization of these datasets as multidimensional data cubes represents the state of the art in analysis-ready data regarding information extraction. EO data cubes can be defined as a set of time-series images associated with spatially aligned pixels along the temporal dimension. Some key technologies have been developed to take advantage of the data cube power. The Open Data Cube (ODC) framework and the Brazil Data Cube (BDC) platform provide capabilities to access and analyze EO data cubes. This paper introduces two new tools to facilitate the creation of land use and land cover (LULC) maps using EO data cubes and Machine Learning techniques, both built on top of ODC and BDC technologies. The first tool is a module that extends the ODC framework capabilities to lower the barriers to using Machine Learning (ML) algorithms with EO data. The second tool relies on integrating the R package named Satellite Image Time Series (sits) with ODC to enable the use of the data managed by the framework. Finally, water mask classification and LULC mapping applications are presented to demonstrate the processing capabilities of the tools.


2014 ◽  
Vol 7 (4) ◽  
pp. 63-78 ◽  
Author(s):  
Rahhal Errattahi ◽  
Mohammed Fakir ◽  
Fatima Zahra Salmam

OLAP is an important technology that offers fast and interactive data navigation. It also provides tools to explore data cubes in order to extract interesting information from multidimensional data structures. However, OLAP exploration is done manually, without tools that could automatically extract relevant information from the cube. In addition, OLAP is not capable of explaining relationships that could exist within the data. This paper presents a new approach to coupling data mining and online analytical processing. The approach provides explanations in OLAP data cubes by using association rules between inter-dimensional predicates. The mining process can be carried out by either of two algorithms, Apriori and FP-Growth, in which aggregate measures are exploited to calculate support and confidence. The paper also evaluates the interestingness of the mined association rules according to the Lift criterion.
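As a rough illustration of how support, confidence, and lift can be read off aggregated cube cells rather than individual transactions, consider the sketch below. It uses the standard definitions of the three metrics. The two-dimensional `cube`, its COUNT measure, and the example rule are hypothetical, and the paper may define its aggregates differently (for instance, over a SUM measure).

```python
# Hypothetical cube: (region, product) -> COUNT of facts in that cell.
cube = {
    ("north", "laptop"): 30,
    ("north", "phone"): 10,
    ("south", "laptop"): 20,
    ("south", "phone"): 40,
}


def rule_metrics(cube, region, product):
    """Metrics for the inter-dimensional rule: region=<region> -> product=<product>."""
    n_total = sum(cube.values())
    # Roll-ups along one dimension give the antecedent and consequent counts.
    n_antecedent = sum(n for (r, _), n in cube.items() if r == region)
    n_consequent = sum(n for (_, p), n in cube.items() if p == product)
    n_both = cube.get((region, product), 0)

    support = n_both / n_total
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n_total)
    return support, confidence, lift


support, confidence, lift = rule_metrics(cube, "north", "laptop")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than they would if the two dimensions were independent.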


2008 ◽  
pp. 974-1003 ◽  
Author(s):  
Alfredo Cuzzocrea ◽  
Domenico Sacca ◽  
Paolo Serafino

Efficiently supporting advanced OLAP visualization of multidimensional data cubes is a novel and challenging research topic, which is of interest for a large family of data warehouse applications relying on the management of spatio-temporal (e.g., mobile) data, scientific and statistical data, sensor network data, biological data, etc. On the other hand, the issue of visualizing multidimensional data domains has been largely neglected by the research community, since it does not belong to the well-founded conceptual-logical-physical design hierarchy inherited from relational database methodologies. Inspired by these considerations, in this article we propose an innovative advanced OLAP visualization technique that meaningfully combines (i) the so-called OLAP dimension flattening process, which allows us to extract two-dimensional OLAP views from multidimensional data cubes, and (ii) very efficient data compression techniques for such views, which allow us to generate “semantics-aware” compressed representations where data are grouped along OLAP hierarchies.

