Dynamic Multidimensional Data Cubes

Data cubes are ubiquitous tools in data warehousing, online analytical processing, and decision support applications. Based on a selection of pre-computed and materialized aggregate values, they can dramatically speed up aggregation and summarization over large data collections. Traditionally, the emphasis has been on lowering query costs with little regard to maintenance, i.e., update cost issues. We argue that current trends require data cubes to be not only query-efficient but also dynamic, and we show how this can be achieved. Several array-based techniques with different tradeoffs between query and update cost are discussed in detail. We also survey selected approaches for sparse data and the popular data cube operator, CUBE. Moreover, this work includes an overview of future trends and their impact on data cubes.
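The query/update tradeoff mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. It is not taken from the chapter: the class names `RawArray` and `PrefixSumArray` and the sample data are hypothetical, and the prefix-sum layout is only one well-known array-based technique, assumed here for illustration.

```python
from itertools import accumulate


class RawArray:
    """Stores raw cell values: O(1) update, O(n) range query."""

    def __init__(self, values):
        self.cells = list(values)

    def update(self, i, delta):
        self.cells[i] += delta

    def range_sum(self, lo, hi):
        # Scans every cell in the inclusive range [lo, hi].
        return sum(self.cells[lo:hi + 1])


class PrefixSumArray:
    """Stores pre-computed prefix sums: O(1) range query, O(n) update."""

    def __init__(self, values):
        # prefix[k] holds the sum of the first k cells.
        self.prefix = [0] + list(accumulate(values))

    def update(self, i, delta):
        # Every prefix that covers cell i must be adjusted.
        for k in range(i + 1, len(self.prefix)):
            self.prefix[k] += delta

    def range_sum(self, lo, hi):
        # Two lookups answer any inclusive range [lo, hi].
        return self.prefix[hi + 1] - self.prefix[lo]


sales = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical measure values
raw, pre = RawArray(sales), PrefixSumArray(sales)
assert raw.range_sum(2, 5) == pre.range_sum(2, 5)

raw.update(3, 10)
pre.update(3, 10)
assert raw.range_sum(2, 5) == pre.range_sum(2, 5)
```

Both structures return the same answers; they differ only in where the cost is paid, which is the tension between query efficiency and dynamic maintenance that the abstract describes.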

Navigation Rules for Exploring Large Multidimensional Data Cubes

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch076 ◽

2008 ◽

pp. 1334-1354

Author(s):

Navin Kumar ◽

Aryya Gangopadhyay ◽

George Karabatis ◽

Sanjay Bapna ◽

Zhiyuan Chen

Keyword(s):

Knowledge Discovery ◽

Multidimensional Data ◽

Discovery Process ◽

High Recall ◽

Data Cubes ◽

Driven System ◽

On Line ◽

Analytical Processing ◽

Multidimensional Data Cubes ◽

Drill Down

Navigating through multidimensional data cubes is a nontrivial task. Although On-Line Analytical Processing (OLAP) provides the capability to view multidimensional data through rollup, drill-down, and slicing-dicing, it offers minimal guidance to end users in the actual knowledge discovery process. In this article, we address this knowledge discovery problem by identifying novel and useful patterns concealed in multidimensional data that are used for effective exploration of data cubes. We present an algorithm for the DIscovery of Sk-NAvigation Rules (DISNAR), which discovers the hidden interesting patterns in the form of Sk-navigation rules using a test of skewness on the pairs of the current and its candidate drill-down lattice nodes. The rules are then used to enhance navigational capabilities, as illustrated by our rule-driven system. Extensive experimental analysis shows that the DISNAR algorithm discovers the interesting patterns with high recall and precision, short execution time, and low space overhead.

Classification on Top of Data Cube

Web Mining ◽

10.4018/978-1-59140-414-9.ch009 ◽

2011 ◽

pp. 189-207

Author(s):

Lixin Fu

Keyword(s):

Relational Databases ◽

Large Data ◽

Original Data ◽

Data Cube ◽

Large Data Sets ◽

Data Sets ◽

Classification Models ◽

Data Cubes ◽

Analytical Processing ◽

Free Classification

Currently, data classification is either performed on data stored in relational databases or performed on data stored in flat files. The problem with these approaches is that for large data sets, they often need multiple scans of the original data and thus are often infeasible in many applications. In this chapter we propose to deploy classification on top of OLAP (online analytical processing) and data cube systems. First, we compute the statistics in various combinations of the attributes known as data cubes. The statistics are then used to derive classification models. In this way, we only scan the original data once, which improves the performance of classification significantly. Furthermore, our new classifier will provide “free” classification by eliminating the dominating I/O overhead of scanning the massive original data. An architecture that integrates database, data cube, and data mining is given and three new cube-based classifiers are presented and evaluated.
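The idea of scanning the data once and deriving a classifier from aggregated statistics can be sketched as follows. This is an illustration under stated assumptions, not one of the chapter's three cube-based classifiers: naive Bayes is swapped in as a simple model that needs only counts, and the function names, the toy rows, and the binary-attribute smoothing are all hypothetical.

```python
from collections import Counter


def build_count_cube(rows, label_index):
    """Single pass over the data: count classes and (attribute, value, class) cells."""
    class_counts = Counter()
    cell_counts = Counter()
    for row in rows:
        label = row[label_index]
        class_counts[label] += 1
        for attr, value in enumerate(row):
            if attr != label_index:
                cell_counts[(attr, value, label)] += 1
    return class_counts, cell_counts


def classify(class_counts, cell_counts, features):
    """Naive Bayes prediction that reads only the aggregated counts.

    `features` maps attribute index -> value. Add-one smoothing is used;
    the +2 in the denominator assumes two values per attribute.
    """
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        score = n / total  # class prior
        for attr, value in features.items():
            score *= (cell_counts[(attr, value, label)] + 1) / (n + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training rows: (outlook, humidity, play)
rows = [
    ("sunny", "high", "no"),
    ("sunny", "normal", "yes"),
    ("rain", "high", "no"),
    ("rain", "normal", "yes"),
    ("sunny", "high", "no"),
]
class_counts, cell_counts = build_count_cube(rows, label_index=2)
prediction = classify(class_counts, cell_counts, {0: "sunny", 1: "normal"})
```

After `build_count_cube` returns, `classify` never touches `rows` again, which is the sense in which the scan of the original data is paid only once.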

Structuring and visualization of indicators in multidimensional data cubes

Informacionno-technologicheskij vestnik ◽

10.21499/2409-1650-2018-4-79-87 ◽

2018 ◽

pp. 79-87

Author(s):

E. E. Akimkina

Keyword(s):

Decision Making ◽

Data Processing ◽

Data Management ◽

Multidimensional Data ◽

End User ◽

Processing Technologies ◽

Data Cubes ◽

Multidimensional Visualization ◽

Multidimensional Data Cubes ◽

Practical Recommendations

This paper analyzes the problems of structuring indicators in multidimensional data cubes and their subsequent processing with end-user tools that provide multidimensional visualization and data management. It shows the possibilities of multidimensional data processing technologies for managing and supporting decision making at a design and technological enterprise, and gives practical recommendations on the use of domestic computer environments for structuring and visualizing multidimensional data cubes.

Querying Multidimensional Data

Multidimensional Databases ◽

10.4018/978-1-59140-053-0.ch009 ◽

2003 ◽

pp. 252-281

Author(s):

Leonardo Tininini

Keyword(s):

Algebraic Approach ◽

Evaluation Process ◽

Data Retrieval ◽

Data Cube ◽

Multidimensional Data ◽

Integration Policy ◽

Multidimensional Database ◽

Efficient Data ◽

Speed Up ◽

Logical Data

A powerful and easy-to-use querying environment is certainly one of the most important components in a multidimensional database, and its effectiveness is influenced by many other aspects, both logical (data model, integration, policy of view materialization, etc.) and physical (multidimensional or relational storage, indexes, etc.). As is evident, multidimensional querying is often based on the metaphor of the data cube and on the concepts of facts, measures, and dimensions. In contrast to conventional transactional environments, multidimensional querying is often an exploratory process, performed by navigating along the dimensions and measures, increasing/decreasing the level of detail and focusing on specific subparts of the cube that appear to be “promising” for the required information. In this chapter we focus on the main languages proposed in the literature to express multidimensional queries, particularly those based on: (i) an algebraic approach, (ii) a declarative paradigm (calculus), and (iii) visual constructs and syntax. We analyze the problem of evaluation, i.e., the issues related to the efficient data retrieval and calculation, possibly (often necessarily) using some pre-computed data, a problem known in the literature as the problem of rewriting a query using views. We also illustrate the use of particular index structures to speed up the query evaluation process.
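The problem of rewriting a query using views can be illustrated with a small sketch: a coarser aggregate is answered from a pre-computed finer one instead of from the base facts. The fact rows, the `aggregate` helper, and the choice of SUM are assumptions made for this example and do not come from the chapter; the rewrite is valid here because SUM is distributive.

```python
from collections import defaultdict


def aggregate(rows, key_positions, measure_position):
    """Group rows by the given key positions and SUM the measure."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_positions)
        totals[key] += row[measure_position]
    return dict(totals)


# Hypothetical fact rows: (product, city, month, quantity)
facts = [
    ("pen", "Rome", "2003-01", 5),
    ("pen", "Rome", "2003-02", 7),
    ("pen", "Milan", "2003-01", 2),
    ("ink", "Rome", "2003-01", 4),
    ("ink", "Milan", "2003-02", 6),
]

# Materialized view: SUM(quantity) grouped by (product, city).
view = aggregate(facts, (0, 1), 3)
view_rows = [key + (total,) for key, total in view.items()]

# Query: SUM(quantity) grouped by product, rewritten to read the view.
by_product_from_view = aggregate(view_rows, (0,), 2)

# The same query evaluated directly on the base facts, for comparison.
by_product_from_facts = aggregate(facts, (0,), 3)
assert by_product_from_view == by_product_from_facts
```

The rewritten query reads four view rows rather than five fact rows; on realistic data the view is typically far smaller than the fact table, which is where the saving comes from.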

Improved Data Partitioning for Building Large ROLAP Data Cubes in Parallel

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch201 ◽

2008 ◽

pp. 3176-3193

Author(s):

Ying Chen ◽

Frank Dehne ◽

Todd Eavis ◽

A. Rau-Chaplin

Keyword(s):

High Dimensional Data ◽

Large Data ◽

Data Partitioning ◽

Data Cube ◽

High Dimensional ◽

Parallel Method ◽

Data Cubes ◽

Linear Speedup ◽

Optimal Linear ◽

Partitioning Technique

This paper presents an improved parallel method for generating ROLAP data cubes on a shared-nothing multiprocessor based on a novel optimized data partitioning technique. Since no shared disk is required, our method can be used for highly scalable processor clusters consisting of standard PCs with local disks only, connected via a data switch. Experiments show that our improved parallel method provides optimal linear speedup for at least 32 processors. The approach taken, which uses a ROLAP representation of the data cube, is well suited for large data warehouses and high dimensional data, and supports the generation of both fully materialized and partially materialized data cubes.

A Hierarchy-Driven Compression Technique for Advanced OLAP Visualization of Multidimensional Data Cubes

Data Warehousing and Knowledge Discovery - Lecture Notes in Computer Science ◽

10.1007/11823728_11 ◽

2006 ◽

pp. 106-119 ◽

Cited By ~ 17

Author(s):

Alfredo Cuzzocrea ◽

Domenico Saccà ◽

Paolo Serafino

Keyword(s):

Multidimensional Data ◽

Compression Technique ◽

Data Cubes ◽

Multidimensional Data Cubes

Modeling Multidimensional Data Cubes Based on MDA (Model-Driven Architecture)

Advanced Computational Methods for Knowledge Engineering - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-319-38884-7_7 ◽

2016 ◽

pp. 85-97

Author(s):

Truong Dinh Huy ◽

Nguyen Thanh Binh ◽

Ngo Sy Ngoc

Keyword(s):

Multidimensional Data ◽

Model Driven Architecture ◽

Data Cubes ◽

Model Driven ◽

Multidimensional Data Cubes

Integrating Open Data Cube and Brazil Data Cube Platforms for Land Use and Cover Classifications

Revista Brasileira de Cartografia ◽

10.14393/rbcv73n4-60387 ◽

2021 ◽

Vol 73 (4) ◽

pp. 1036-1047

Author(s):

Felipe Menino Carlos ◽

Vitor Conrado Faria Gomes ◽

Gilberto Ribeiro de Queiroz ◽

Felipe Carvalho de Souza ◽

Karine Reis Ferreira ◽

...

Keyword(s):

Machine Learning ◽

Land Use ◽

Time Series ◽

Satellite Image ◽

Open Data ◽

Data Cube ◽

Machine Learning Techniques ◽

Multidimensional Data ◽

Temporal Dimension ◽

Data Cubes

The potential to perform spatiotemporal analysis of the Earth's surface, fostered by a large amount of Earth Observation (EO) open data provided by space agencies, brings new perspectives to create innovative applications. Nevertheless, these big datasets pose some challenges regarding storage and analytical processing capabilities. The organization of these datasets as multidimensional data cubes represents the state-of-the-art in analysis-ready data regarding information extraction. EO data cubes can be defined as a set of time-series images associated with spatially aligned pixels along the temporal dimension. Some key technologies have been developed to take advantage of the data cube power. The Open Data Cube (ODC) framework and the Brazil Data Cube (BDC) platform provide capabilities to access and analyze EO data cubes. This paper introduces two new tools to facilitate the creation of land use and land cover (LULC) maps using EO data cubes and Machine Learning techniques, both built on top of ODC and BDC technologies. The first tool is a module that extends the ODC framework capabilities to lower the barriers to using Machine Learning (ML) algorithms with EO data. The second tool relies on integrating the R package named Satellite Image Time Series (sits) with ODC to enable the use of the data managed by the framework. Finally, water mask classification and LULC mapping applications are presented to demonstrate the processing capabilities of the tools.

Explanation in OLAP Data Cubes

Journal of Information Technology Research ◽

10.4018/jitr.2014100105 ◽

2014 ◽

Vol 7 (4) ◽

pp. 63-78 ◽

Cited By ~ 1

Author(s):

Rahhal Errattahi ◽

Mohammed Fakir ◽

Fatima Zahra Salmam

Keyword(s):

Data Mining ◽

Association Rules ◽

Data Structures ◽

Relevant Information ◽

Multidimensional Data ◽

New Approach ◽

Data Cubes ◽

Multidimensional Data Structures ◽

Analytical Processing ◽

Interactive Data

OLAP is an important technology that offers fast and interactive data navigation. It also provides tools to explore data cubes in order to extract interesting information from multidimensional data structures. However, OLAP exploration is done manually, without tools that could automatically extract relevant information from the cube. In addition, OLAP is not capable of explaining relationships that could exist within the data. This paper presents a new approach to coupling data mining and online analytical processing. The approach provides explanation in OLAP data cubes by using association rules between inter-dimensional predicates. The mining process can be done by one of two algorithms, Apriori and FP-Growth, in which aggregate measures are exploited to calculate support and confidence. It also evaluates the interestingness of mined association rules according to the Lift criterion.
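The three rule measures named in this abstract (support, confidence, and lift) have standard definitions that can be computed directly from aggregate counts such as those stored in cube cells. The sketch below shows only that arithmetic; the function name, the example rule, and the counts are hypothetical, and the mining step itself (Apriori or FP-Growth) is not reproduced.

```python
def rule_metrics(count_both, count_antecedent, count_consequent, count_total):
    """Support, confidence, and lift of a rule A -> B from aggregate counts.

    count_both        facts matching both A and B
    count_antecedent  facts matching A
    count_consequent  facts matching B
    count_total       all facts in the cube
    """
    support = count_both / count_total
    confidence = count_both / count_antecedent
    lift = confidence / (count_consequent / count_total)
    return support, confidence, lift


# Hypothetical inter-dimensional rule: (city = Rome) -> (product = pen)
support, confidence, lift = rule_metrics(
    count_both=40, count_antecedent=50, count_consequent=80, count_total=200
)
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than they would if they were independent, which is why lift is commonly used to rank rules by interestingness.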

Semantics-Aware Advanced OLAP Visualization of Multidimensional Data Cubes

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch057 ◽

2008 ◽

pp. 974-1003 ◽

Cited By ~ 1

Author(s):

Alfredo Cuzzocrea ◽

Domenico Saccà ◽

Paolo Serafino

Keyword(s):

Data Warehouse ◽

Large Family ◽

Research Topic ◽

Biological Data ◽

Multidimensional Data ◽

Data Cubes ◽

Efficient Data ◽

Spatio Temporal ◽

Multidimensional Data Cubes ◽

Flattening Process

Efficiently supporting advanced OLAP visualization of multidimensional data cubes is a novel and challenging research topic that is of interest for a large family of data warehouse applications relying on the management of spatio-temporal (e.g., mobile) data, scientific and statistical data, sensor network data, biological data, etc. On the other hand, the issue of visualizing multidimensional data domains has been largely neglected by the research community, since it does not belong to the well-founded conceptual-logical-physical design hierarchy inherited from relational database methodologies. Inspired by these considerations, in this article we propose an innovative advanced OLAP visualization technique that meaningfully combines (i) the so-called OLAP dimension flattening process, which allows us to extract two-dimensional OLAP views from multidimensional data cubes, and (ii) very efficient data compression techniques for such views, which allow us to generate “semantics-aware” compressed representations where data are grouped along OLAP hierarchies.
