Computation of OLAP Data Cubes

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch046 ◽

2011 ◽

pp. 286-292

Author(s):

Amin A. Abdulghani

Keyword(s):

Data Structure ◽

Data Cube ◽

Primary Interest ◽

Total Cost ◽

Sales Data ◽

Data Cubes ◽

Multiple Dimensions ◽

Analytical Processing ◽

Product Location ◽

Total Sales

The focus of online analytical processing (OLAP) is to provide a platform for analyzing data (e.g., sales data) with multiple dimensions (e.g., product, location, time) and multiple measures (e.g., total sales or total cost). OLAP operations then allow viewing of this data from a number of perspectives. For analysis, the object or data structure of primary interest in OLAP is a data cube. A detailed introduction to OLAP is presented in Han and Kamber (2006).
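As a rough illustration of what computing a data cube involves, the sketch below aggregates one measure over every subset of the dimensions named in the abstract (product, location, time). The rows, the function name `compute_cube`, and the choice of a sum aggregate are assumptions made for illustration; the chapter's own algorithms are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (product, location, time, total_sales)
ROWS = [
    ("pen", "NY", "Q1", 10.0),
    ("pen", "LA", "Q1", 5.0),
    ("ink", "NY", "Q2", 7.0),
]
DIMENSIONS = ("product", "location", "time")


def compute_cube(rows, dims=DIMENSIONS):
    """Return {group-by dimensions: {dimension values: summed measure}}
    for every subset of dims (2**len(dims) group-bys in total)."""
    cube = {}
    indices = range(len(dims))
    for k in range(len(dims) + 1):
        for subset in combinations(indices, k):
            cuboid = defaultdict(float)
            for row in rows:
                key = tuple(row[i] for i in subset)
                cuboid[key] += row[-1]  # the measure is the last field
            cube[tuple(dims[i] for i in subset)] = dict(cuboid)
    return cube


cube = compute_cube(ROWS)
print(cube[("product",)])  # {('pen',): 15.0, ('ink',): 7.0}
print(cube[()])            # {(): 22.0}, the grand total
```

This naive version rescans the rows once per group-by, which is the cost that dedicated cube computation methods try to avoid.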

Computation of OLAP Cubes

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch038 ◽

2011 ◽

pp. 196-201

Author(s):

Amin A. Abdulghani

Keyword(s):

Data Structure ◽

Online Analytical Processing ◽

Primary Interest ◽

Total Cost ◽

Sales Data ◽

Multiple Measures ◽

Multiple Dimensions ◽

Analytical Processing ◽

Product Location ◽

Total Sales

The focus of Online Analytical Processing (OLAP) is to provide a platform for analyzing data (e.g., sales data) with multiple dimensions (e.g., product, location, time) and multiple measures (e.g., total sales or total cost). OLAP operations then allow viewing of this data from a number of perspectives. For analysis, the object or data structure of primary interest in OLAP is a cube.

Dynamic Multidimensional Data Cubes

Multidimensional Databases ◽

10.4018/978-1-59140-053-0.ch007 ◽

2003 ◽

pp. 200-221 ◽

Cited By ~ 1

Author(s):

Mirek Riedewald ◽

Divyakant Agrawal ◽

Amr El Abbadi

Keyword(s):

Large Data ◽

Data Cube ◽

Multidimensional Data ◽

Future Trends ◽

Data Cubes ◽

Current Trends ◽

Speed Up ◽

Analytical Processing ◽

Data Collections ◽

Multidimensional Data Cubes

Data cubes are ubiquitous tools in data warehousing, online analytical processing, and decision support applications. Based on a selection of pre-computed and materialized aggregate values, they can dramatically speed up aggregation and summarization over large data collections. Traditionally, the emphasis has been on lowering query costs with little regard to maintenance, i.e., update cost issues. We argue that current trends require data cubes to be not only query-efficient, but also dynamic at the same time, and we also show how this can be achieved. Several array-based techniques with different tradeoffs between query and update cost are discussed in detail. We also survey selected approaches for sparse data and the popular data cube operator, CUBE. Moreover, this work includes an overview of future trends and their impact on data cubes.
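The trade-off between query cost and update cost can be seen in a prefix-sum array, a well-known array-based structure. The sketch below is only an illustration of that trade-off; the class name `PrefixSumCube` is hypothetical, and whether the chapter discusses this exact variant is an assumption.

```python
class PrefixSumCube:
    """2-D prefix-sum array: constant-time range sums, costly updates."""

    def __init__(self, cells):
        self.rows = len(cells)
        self.cols = len(cells[0])
        self.cells = [list(r) for r in cells]
        # prefix[i][j] holds the sum of cells[0..i-1][0..j-1]
        self.prefix = [[0] * (self.cols + 1) for _ in range(self.rows + 1)]
        for i in range(self.rows):
            for j in range(self.cols):
                self.prefix[i + 1][j + 1] = (
                    cells[i][j]
                    + self.prefix[i][j + 1]
                    + self.prefix[i + 1][j]
                    - self.prefix[i][j]
                )

    def range_sum(self, r1, c1, r2, c2):
        """Sum over cells[r1..r2][c1..c2], inclusive, using four lookups."""
        p = self.prefix
        return p[r2 + 1][c2 + 1] - p[r1][c2 + 1] - p[r2 + 1][c1] + p[r1][c1]

    def update(self, r, c, value):
        """Set one cell and return how many prefix entries had to change."""
        delta = value - self.cells[r][c]
        self.cells[r][c] = value
        touched = 0
        for i in range(r + 1, self.rows + 1):
            for j in range(c + 1, self.cols + 1):
                self.prefix[i][j] += delta
                touched += 1
        return touched


grid = PrefixSumCube([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(grid.range_sum(1, 1, 2, 2))  # 28
print(grid.update(0, 0, 10))       # 9 prefix entries rewritten
```

A range query reads four entries regardless of its size, while updating the corner cell rewrites the whole prefix array. Storing only the raw cells reverses the costs: updates touch one entry, and range queries scan every cell in the range.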

Classification on Top of Data Cube

Web Mining ◽

10.4018/978-1-59140-414-9.ch009 ◽

2011 ◽

pp. 189-207

Author(s):

Lixin Fu

Keyword(s):

Relational Databases ◽

Large Data ◽

Original Data ◽

Data Cube ◽

Large Data Sets ◽

Data Sets ◽

Classification Models ◽

Data Cubes ◽

Analytical Processing ◽

Free Classification

Currently, data classification is either performed on data stored in relational databases or performed on data stored in flat files. The problem with these approaches is that for large data sets, they often need multiple scans of the original data and thus are often infeasible in many applications. In this chapter we propose to deploy classification on top of OLAP (online analytical processing) and data cube systems. First, we compute the statistics in various combinations of the attributes known as data cubes. The statistics are then used to derive classification models. In this way, we only scan the original data once, which improves the performance of classification significantly. Furthermore, our new classifier will provide “free” classification by eliminating the dominating I/O overhead of scanning the massive original data. An architecture that integrates database, data cube, and data mining is given and three new cube-based classifiers are presented and evaluated.
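The idea of deriving a classifier from pre-computed statistics after a single scan can be sketched with a naive Bayes model built from counts. This is not one of the chapter's three cube-based classifiers; the training rows, function names, and the add-one smoothing are assumptions chosen to keep the example small.

```python
from collections import Counter

# Hypothetical training rows: (outlook, windy, label)
TRAIN = [
    ("sunny", "no", "play"),
    ("sunny", "yes", "stay"),
    ("rainy", "yes", "stay"),
    ("sunny", "no", "play"),
    ("rainy", "no", "play"),
]


def scan_once(rows):
    """Single pass over the data: class counts plus
    (attribute index, value, class) counts."""
    class_counts = Counter()
    joint_counts = Counter()
    for *attrs, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(attrs):
            joint_counts[(i, value, label)] += 1
    return class_counts, joint_counts


def classify(attrs, class_counts, joint_counts):
    """Naive Bayes with add-one smoothing, using only the stored counts."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in class_counts.items():
        score = n / total
        for i, value in enumerate(attrs):
            # Smoothing assumes two possible values per attribute.
            score *= (joint_counts[(i, value, label)] + 1) / (n + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


class_counts, joint_counts = scan_once(TRAIN)
print(classify(("sunny", "no"), class_counts, joint_counts))   # play
print(classify(("rainy", "yes"), class_counts, joint_counts))  # stay
```

Once `scan_once` has run, classification never touches the original rows again, which is the source of the I/O savings the abstract describes.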

Little filtered cigars: US sales, flavours, package sizes and prices

Tobacco Control ◽

10.1136/tobaccocontrol-2018-054405 ◽

2018 ◽

Vol 28 (3) ◽

pp. 346-349 ◽

Cited By ~ 1

Author(s):

Doris G Gammon ◽

Todd Rogers ◽

Ellen M Coats ◽

James M Nonnemaker ◽

Lisa Henriksen

Keyword(s):

Regional Differences ◽

Sales Volume ◽

The South ◽

Sales Data ◽

Pack Size ◽

Cigarette Taxes ◽

The Us ◽

The Usa ◽

State And Local ◽

Total Sales

Objective: At least four varieties of little filtered cigars (LFCs) violate the US prohibition on flavoured cigarettes other than menthol. This study characterises the sales of prohibited products and other LFCs by flavour category and pack size, as well as the price of LFCs relative to cigarettes. Methods: Using retail sales data for 2016, we computed the sales volume in dollars and equivalent units and the percentage of total sales by flavour and pack size for the USA by region and state. Paired t-tests compared the prices for LFCs and cigarettes sold in same-sized packs and cartons. Results: LFC sales totalled 24 033 equivalent units per 100 000 persons in 2016. Flavoured LFC varieties accounted for almost half (47.5%) of the total sales. LFCs were sold in 12 different pack sizes, but 79.7% of sales were packs of 20. The price of 20-packs averaged $2.41 (SD=$1.49), which was significantly less than cigarettes (M=$5.90, SD=$0.85). Regional differences suggest a greater proportion of menthol/mint LFCs and lower prices in the South than in other regions. Conclusion: Classifying all LFCs as cigarettes would require that they be offered in a minimum package of 20, eliminate flavoured varieties other than menthol and increase prices through applicable state and local cigarette taxes.

The Austrian Semantic EO Data Cube Infrastructure

Remote Sensing ◽

10.3390/rs13234807 ◽

2021 ◽

Vol 13 (23) ◽

pp. 4807

Author(s):

Martin Sudmanns ◽

Hannah Augustin ◽

Lucas van der Meer ◽

Andrea Baraldi ◽

Dirk Tiede

Keyword(s):

Land Cover ◽

Knowledge Engineering ◽

A Priori ◽

Data Cube ◽

Color Information ◽

Data Cubes ◽

Semantic Enrichment ◽

Knowledge Based ◽

Geographical Regions ◽

Semantic Models

Big optical Earth observation (EO) data analytics usually start from numerical, sub-symbolic reflectance values that lack inherent semantic information (meaning) and require interpretation. However, interpretation is an ill-posed problem that is difficult for many users to solve. Our semantic EO data cube architecture aims to implement computer vision in EO data cubes as an explainable artificial intelligence approach. Automatic semantic enrichment provides semi-symbolic spectral categories for all observations as an initial interpretation of color information. Users graphically create knowledge-based semantic models in a convergence-of-evidence approach, where color information is modelled a-priori as one property of semantic concepts, such as land cover entities. This differs from other approaches that do not use a-priori knowledge and assume a direct 1:1 relationship between reflectance values and land cover. The semantic models are explainable, transferable, reusable, and users can share them in a knowledgebase. We provide insights into our web-based architecture, called Sen2Cube.at, including semantic enrichment, data models, knowledge engineering, semantic querying, and the graphical user interface. Our implemented prototype uses all Sentinel-2 MSI images covering Austria; however, the approach is transferable to other geographical regions and sensors. We demonstrate that explainable, knowledge-based big EO data analysis is possible via graphical semantic querying in EO data cubes.

Summarizing Datacubes

Integrations of Data Warehousing, Data Mining and Database Technologies ◽

10.4018/978-1-60960-537-7.ch002 ◽

2011 ◽

pp. 19-39 ◽

Cited By ~ 1

Author(s):

Rosine Cicchetti ◽

Lotfi Lakhal ◽

Sébastien Nedjar ◽

Noël Novelli ◽

Alain Casali

Keyword(s):

Initial Data ◽

Research Work ◽

Data Cube ◽

Storage Space ◽

Analytical Evaluation ◽

Aggregated Data ◽

Analytical Comparison ◽

Data Cubes ◽

Reduction Methods ◽

Quotient Cube

Datacubes are especially useful for efficiently answering queries on data warehouses. Nevertheless, the amount of aggregated data generated is huge with respect to the initial data, which is itself very large. Recent research work has addressed the issue of summarizing Datacubes in order to reduce their size. In this chapter, we present three different approaches. They propose structures which make it possible to reduce the size of the data cube representation. The first two, the closed cube and the quotient cube, are said to be semantic and discard the redundancies captured within data cubes. The size of the underlying representations is substantially reduced, but at the cost of additional response time when answering OLAP queries. The third approach is rather syntactic since it enforces an optimization at the logical level. It is called the Partition Cube and is based on the concept of partition. We also give an algorithm to compute it. We propose a Relational Partition Cube, a novel R-OLAP cubing solution for managing Partition Cubes using relational technology. An analytical evaluation shows that the storage space of Partition Cubes is smaller than that of Datacubes. To confirm the analytical comparison, experiments are performed to compare our approach with Datacubes and with two of the best reduction methods, the Quotient Cube and the Closed Cube.

Privacy Preserving OLAP and OLAP Security

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch241 ◽

2011 ◽

pp. 1575-1581 ◽

Cited By ~ 26

Author(s):

Alfredo Cuzzocrea ◽

Vincenzo Russo

Keyword(s):

Access Control ◽

Privacy Preservation ◽

Data Warehousing ◽

Privacy Preserving ◽

Data Cube ◽

The Other ◽

Complex Data ◽

Privacy And Security ◽

Domain Experts ◽

Data Cubes

The problem of ensuring the privacy and security of OLAP data cubes (Gray et al., 1997) arises in several fields ranging from advanced Data Warehousing (DW) and Business Intelligence (BI) systems to sophisticated Data Mining (DM) tools. In DW and BI systems, decision-making analysts aim to prevent malicious users from accessing perceptive ranges of multidimensional data in order to infer sensitive knowledge, or from attacking corporate data cubes by violating user rules, grants and revokes. In DM tools, domain experts aim to prevent malicious users from inferring critical-for-the-task knowledge from authoritative DM results such as frequent item sets, patterns and regularities, clusters, and discovered association rules. In more detail, the former application scenario (i.e., DW and BI systems) deals with both the privacy preservation and the security of data cubes, whereas the latter one (i.e., DM tools) deals with privacy preserving OLAP issues solely. With respect to security issues, although security aspects of information systems include a plethora of topics ranging from cryptography to access control and secure digital signature, in our work we particularly focus on access control techniques for data cubes, and refer the reader to the active literature for the other orthogonal matters. Specifically, privacy preservation of data cubes refers to the problem of ensuring the privacy of data cube cells (and, in turn, that of queries defined over collections of data cube cells), i.e. hiding sensitive information and knowledge during data management activities, according to the general guidelines drawn by Sweeney in her seminal paper (Sweeney, 2002), whereas access control issues refer to the problem of ensuring the security of data cube cells, i.e. restricting the access of unauthorized users to specific sub-domains of the target data cube, according to well-known concepts studied and assessed in the context of DBMS security.
Nonetheless, it is straightforward to foresee that these two aspects, although distinct, should be meaningfully integrated in order to ensure both the privacy and security of complex data cubes, i.e. data cubes built on top of complex data/knowledge bases. In recent years, these topics have become of great interest to the Data Warehousing and Databases research communities, due to their exciting theoretical challenges as well as their relevance and practical impact in modern real-life OLAP systems and applications. On a more conceptual plane, theoretical work is mainly devoted to studying how probability and statistics schemes as well as rule-based models can be applied in order to efficiently solve the above-introduced problems. On a more practical plane, researchers and practitioners aim at integrating convenient privacy preserving and security solutions within the core layers of commercial OLAP server platforms. Basically, to tackle the resulting privacy preservation challenges in OLAP, researchers have proposed models and algorithms that can be roughly classified within two main classes: restriction-based techniques and data perturbation techniques. The former propose limiting the number of query kinds that can be posed against the target OLAP server. The latter propose perturbing data cells by means of random noise at various levels, ranging from schemas to queries. On the other hand, access control solutions in OLAP are mainly inspired by the wide literature developed in the context of controlling accesses to DBMS, and try to adapt such schemes in order to control accesses to OLAP systems.
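The two classes of privacy-preserving techniques can be contrasted in a few lines. The sketch below is a toy illustration under stated assumptions: the cell values, the function names `perturb` and `restricted_sum`, the uniform noise, and the minimum-cell-count rule are all hypothetical and are not taken from the chapter or from any particular OLAP server.

```python
import random

# Hypothetical cube cells keyed by (product, region)
CELLS = {
    ("pen", "north"): 120.0,
    ("pen", "south"): 80.0,
    ("ink", "north"): 45.0,
    ("ink", "south"): 55.0,
}


def perturb(cells, scale, seed=0):
    """Data perturbation: add bounded random noise to every cell."""
    rng = random.Random(seed)
    return {key: value + rng.uniform(-scale, scale) for key, value in cells.items()}


def restricted_sum(cells, keys, min_cells=2):
    """Restriction: refuse aggregate queries that cover too few cells,
    since such queries can reveal individual cell values."""
    covered = [key for key in keys if key in cells]
    if len(covered) < min_cells:
        raise PermissionError("query covers too few cells")
    return sum(cells[key] for key in covered)


noisy = perturb(CELLS, scale=5.0)
print(restricted_sum(CELLS, [("pen", "north"), ("pen", "south")]))  # 200.0
```

Perturbation answers every query but only approximately, whereas restriction answers exactly but rejects some queries outright.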

Improved Data Partitioning for Building Large ROLAP Data Cubes in Parallel

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch201 ◽

2008 ◽

pp. 3176-3193

Author(s):

Ying Chen ◽

Frank Dehne ◽

Todd Eavis ◽

A. Rau-Chaplin

Keyword(s):

High Dimensional Data ◽

Large Data ◽

Data Partitioning ◽

Data Cube ◽

High Dimensional ◽

Parallel Method ◽

Data Cubes ◽

Linear Speedup ◽

Optimal Linear ◽

Partitioning Technique

This paper presents an improved parallel method for generating ROLAP data cubes on a shared-nothing multiprocessor based on a novel optimized data partitioning technique. Since no shared disk is required, our method can be used for highly scalable processor clusters consisting of standard PCs with local disks only, connected via a data switch. Experiments show that our improved parallel method provides optimal linear speedup for at least 32 processors. The approach taken, which uses a ROLAP representation of the data cube, is well suited for large data warehouses and high dimensional data, and supports the generation of both fully materialized and partially materialized data cubes.

Achieving the Full Vision of Earth Observation Data Cubes

Data ◽

10.3390/data4030094 ◽

2019 ◽

Vol 4 (3) ◽

pp. 94 ◽

Cited By ~ 6

Author(s):

Steve Kopp ◽

Peter Becker ◽

Abhijit Doshi ◽

Dawn J. Wright ◽

Kaixi Zhang ◽

...

Keyword(s):

Image Data ◽

Earth Observation ◽

Data Cube ◽

Observation Data ◽

Data Cubes ◽

Multiple Data ◽

Storage Location ◽

Science Community ◽

Earth Observation Data ◽

And Storage

Earth observation imagery has traditionally been expensive and difficult to find and access, and has required specialized skills and software to transform into actionable information. This has limited adoption by the broader science community. Changes in the cost of imagery and in computing technology over the last decade have enabled a new approach to organizing, analyzing, and sharing Earth observation imagery, broadly referred to as a data cube. The vision and promise of image data cubes is to lower these hurdles and expand the user community by making analysis ready data readily accessible and providing modern approaches to more easily analyze and visualize the data, empowering a larger community of users to improve their knowledge of place and make better informed decisions. Image data cubes are large collections of temporal, multivariate datasets typically consisting of analysis ready multispectral Earth observation data. Several flavors and variations of data cubes have emerged. To simplify access for end users we developed a flexible approach supporting multiple data cube styles, referencing images in their existing structure and storage location, enabling fast access, visualization, and analysis from a wide variety of web and desktop applications. We provide here an overview of that approach and three case studies.

Basic Notions on Multidimensional Aggregate Data

Encyclopedia of Information Science and Technology, First Edition ◽

10.4018/978-1-59140-553-5.ch040 ◽

2005 ◽

pp. 211-216

Author(s):

Maurizio Rafanelli

Keyword(s):

Economic Analysis ◽

Statistical Data ◽

Aggregate Data ◽

Data Cube ◽

Application Field ◽

Business Analysis ◽

On Line ◽

Analytical Processing ◽

On Line Analytical Processing ◽

New Applications

The term multidimensional aggregate data (MAD; see Rafanelli, 2003) generally refers to data in which a given fact is quantified by a set of measures obtained by applying a more or less complex aggregate function (count, sum, average, percent, etc.) to raw data; these measures are characterized by a set of variables called dimensions. MAD can be modeled by different representations, depending on the application field which uses them. For example, some years ago this term referred essentially to statistical data, that is, data used essentially for socio-economic analysis. Recently, the metaphor of the data cube was taken up again and used for new applications, such as On-Line Analytical Processing (OLAP), which refer to aggregate and non-aggregate data for business analysis.
