Online Analytical Processing and Data-Cube Technologies

Author(s):  
Lixin Fu ◽  
Wen-Chen Hu

Since the late ’80s and early ’90s, database technologies have evolved to support a new class of applications: online analytical processing (OLAP), in which executive management can make quick and effective strategic decisions based on knowledge obtained by querying large amounts of stored data. Some OLAP systems are also regarded as decision support systems (DSSs) or executive information systems (EISs). Traditional, well-established online transactional processing (OLTP) systems, such as relational database management systems (RDBMSs), mainly deal with mission-critical daily transactions. Their workloads typically consist of a large number of short, simple queries such as lookups, insertions, and deletions, and the main concerns are transaction throughput, consistency, concurrency, and failure recovery. OLAP systems, on the other hand, are mainly analytical and informational. They are usually closely coupled with data warehouses, which can contain very large data sets that may include historical data as well as data integrated from different departments and geographical locations. As a result, data warehouses are usually significantly larger than common OLTP databases. In addition, OLAP workloads are quite different from those of traditional transaction systems: the queries are unpredictable and much more complicated. For example, an OLAP query could be, “For each type of car and each manufacturer, list market share change in terms of car sales between the first quarter of 2005 and the first quarter of 2006.” The purpose of these queries is not the daily operational maintenance of data; instead, it is to extract deeper knowledge from the data for decision support.
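The example query above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SALES` rows, the column layout, and the helper names `market_share` and `share_change` are hypothetical and do not come from the chapter, and a real OLAP system would answer the query from a warehouse rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical fact rows: (car_type, manufacturer, year, quarter, units_sold).
SALES = [
    ("sedan", "A", 2005, 1, 60),
    ("sedan", "B", 2005, 1, 40),
    ("sedan", "A", 2006, 1, 50),
    ("sedan", "B", 2006, 1, 50),
    ("suv", "A", 2005, 1, 10),
    ("suv", "B", 2005, 1, 30),
    ("suv", "A", 2006, 1, 30),
    ("suv", "B", 2006, 1, 30),
]


def market_share(rows, year, quarter):
    """Share of each manufacturer within each car type for one quarter."""
    by_pair = defaultdict(int)  # (car_type, manufacturer) -> units sold
    by_type = defaultdict(int)  # car_type -> units sold
    for car_type, maker, y, q, units in rows:
        if (y, q) == (year, quarter):
            by_pair[(car_type, maker)] += units
            by_type[car_type] += units
    return {key: units / by_type[key[0]] for key, units in by_pair.items()}


def share_change(rows, before, after):
    """Difference in market share between two (year, quarter) periods."""
    old = market_share(rows, *before)
    new = market_share(rows, *after)
    keys = sorted(set(old) | set(new))
    return {key: new.get(key, 0.0) - old.get(key, 0.0) for key in keys}


change = share_change(SALES, (2005, 1), (2006, 1))
for (car_type, maker), delta in change.items():
    print(f"{car_type:6} {maker}: {delta:+.2%}")
```

Even this toy version needs two grouped aggregations and a comparison across time periods, which is the kind of work that distinguishes analytical queries from the short lookups typical of OLTP.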

Web Mining ◽  
2011 ◽  
pp. 189-207
Author(s):  
Lixin Fu

Currently, data classification is performed either on data stored in relational databases or on data stored in flat files. The problem with these approaches is that, for large data sets, they often need multiple scans of the original data and are thus infeasible in many applications. In this chapter we propose to deploy classification on top of OLAP (online analytical processing) and data cube systems. First, we compute the statistics in various combinations of the attributes known as data cubes. The statistics are then used to derive classification models. In this way, we only scan the original data once, which improves the performance of classification significantly. Furthermore, our new classifier will provide “free” classification by eliminating the dominating I/O overhead of scanning the massive original data. An architecture that integrates database, data cube, and data mining is given, and three new cube-based classifiers are presented and evaluated.
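The idea of scanning the data once and then classifying from stored statistics can be illustrated with a minimal sketch. It assumes a naive Bayes model, which is chosen here only because it depends on nothing but counts; the abstract does not say which classifiers the chapter actually presents. The toy rows and the names `build_count_cuboids` and `classify` are hypothetical.

```python
from collections import Counter

# Hypothetical training rows: (outlook, windy, label).
ROWS = [
    ("sunny", "no", "play"),
    ("sunny", "yes", "stay"),
    ("rainy", "yes", "stay"),
    ("rainy", "no", "play"),
    ("sunny", "no", "play"),
    ("rainy", "yes", "stay"),
]


def build_count_cuboids(rows):
    """Single pass over the data: counts per class and per (attribute, value, class)."""
    class_counts = Counter()
    cell_counts = Counter()
    for *attrs, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(attrs):
            cell_counts[(i, value, label)] += 1
    return class_counts, cell_counts


def classify(attrs, class_counts, cell_counts, domain_sizes):
    """Naive Bayes with add-one smoothing, using only the stored counts."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, n in sorted(class_counts.items()):
        score = n / total  # prior P(label)
        for i, value in enumerate(attrs):
            # Smoothed estimate of P(attribute i = value | label).
            score *= (cell_counts[(i, value, label)] + 1) / (n + domain_sizes[i])
        if score > best_score:
            best_label, best_score = label, score
    return best_label


class_counts, cell_counts = build_count_cuboids(ROWS)
prediction = classify(("sunny", "no"), class_counts, cell_counts, domain_sizes=(2, 2))
print(prediction)
```

After `build_count_cuboids` has run, `classify` never touches `ROWS` again, which is the sense in which classification on top of precomputed aggregates avoids repeated scans of the original data.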


Author(s):  
Menaceur Sadek ◽  
Makhlouf Derdour ◽  
Bouramoul Abdelkrim

This article belongs to the field of analysis and personalization of large data sets (Big Data), a topic that has become a major issue and has raised many questions in recent years. Indeed, it is difficult for inexperienced or casual users to extract relevant information in a Big Data context, because the volume, velocity, and variability of the data make it difficult for the user to capture, manage, and process data with traditional methods and tools. In this article, the authors propose a new approach for personalizing OLAP analysis in a Big Data context by using context and user profile. The proposed approach is based on five complementary layers: an external layer, a layer for formulating the contexts defined in the system, a profiling and querying layer, a layer for constructing personalized OLAP cubes, and a final layer for multidimensional analysis of the cubes. The conducted experiment has shown that taking context and user profile into account improves the results of online analytical processing in the context of Big Data.


2021 ◽  
Vol 17 (4) ◽  
pp. 1-28
Author(s):  
Waqas Ahmed ◽  
Esteban Zimányi ◽  
Alejandro A. Vaisman ◽  
Robert Wrembel

Data warehouses (DWs) evolve in both their content and schema due to changes in user requirements, business processes, or external sources, to name a few. Although multiple approaches using temporal and/or multiversion DWs have been proposed to handle these changes, an efficient solution for this problem is still lacking. The authors' approach is to separate concerns and use temporal DWs to deal with content changes, and multiversion DWs to deal with schema changes. To address the former, they previously proposed a temporal multidimensional (MD) model. In this paper, they propose a multiversion MD model for schema evolution to tackle the latter problem. The two models complement each other and allow managing both content and schema evolution. The paper gives the semantics of the schema modification operators (SMOs) used to derive various schema versions. It also shows how online analytical processing (OLAP) operations like roll-up work on the model. Finally, the mapping from the multiversion MD model to a relational schema is given, along with OLAP operations in standard SQL.


Author(s):  
Johann Eder ◽  
Karl Wiggisser

Data warehouses are typically building blocks of decision support systems in companies and public administration. The data contained in a data warehouse is analyzed by means of online analytical processing (OLAP) tools, which provide sophisticated features for aggregating and comparing data. Decision support applications depend on the reliability and accuracy of the contained data. Typically, a data warehouse comprises not only the current snapshot data but also historical data to enable, for instance, analysis over several years. And, as we live in a changing world, one criterion for the reliability and accuracy of the results of such long-period queries is their comparability. Whereas data warehouse systems are well prepared for changes in the transactional data, they are, surprisingly, not able to deal with changes in the master data. Nonetheless, such changes do frequently occur. The crucial point for supporting changes is, first of all, being aware of their existence. Second, once you know that a change took place, it is important to know what changed (i.e., to know the differences between versions and the relations between the elements of different versions). For data warehouses this means that changes are identified and represented, the validity of data and structures is recorded, and this knowledge is used for computing correct results for OLAP queries. This chapter is intended to motivate the need for powerful maintenance mechanisms for data warehouse cubes. It presents some basic terms and definitions for a common understanding and introduces the different aspects of data warehouse maintenance. Furthermore, several approaches addressing the problem are presented and classified by their capabilities.


2011 ◽  
pp. 141-156
Author(s):  
Rahul Singh ◽  
Richard T. Redmond ◽  
Victoria Yoon

Intelligent decision support requires flexible, knowledge-driven analysis of data to solve complex decision problems faced by contemporary decision makers. Recently, online analytical processing (OLAP) and data mining have received much attention from researchers and practitioners alike as components of an intelligent decision support environment. Little has been done, however, to develop models that integrate the capabilities of data mining and online analytical processing into a systematic model for intelligent decision making, one that allows users to examine multiple views of the data generated using knowledge about the environment and the decision problem domain. This paper presents an integrated model in which data mining and online analytical processing complement each other to support intelligent decision making in data-rich environments. The integrated approach models system behaviors that are of interest to decision makers, predicts the occurrence of such behaviors, provides support to explain their occurrence, and supports decision making to identify a course of action to manage these behaviors.


2003 ◽  
pp. 200-221
Author(s):  
Mirek Riedewald ◽  
Divyakant Agrawal ◽  
Amr El Abbadi

Data cubes are ubiquitous tools in data warehousing, online analytical processing, and decision support applications. Based on a selection of pre-computed and materialized aggregate values, they can dramatically speed up aggregation and summarization over large data collections. Traditionally, the emphasis has been on lowering query costs with little regard to maintenance, i.e., update cost issues. We argue that current trends require data cubes to be not only query-efficient, but also dynamic at the same time, and we also show how this can be achieved. Several array-based techniques with different tradeoffs between query and update cost are discussed in detail. We also survey selected approaches for sparse data and the popular data cube operator, CUBE. Moreover, this work includes an overview of future trends and their impact on data cubes.
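The tradeoff between query and update cost mentioned above can be illustrated with a one-dimensional sketch. It contrasts a raw array with a prefix-sum array, a standard technique picked here for illustration; the abstract does not list which array-based techniques the chapter discusses, and the class names and sample data below are hypothetical. The same idea extends to more dimensions.

```python
class RawArray:
    """Stores the cells as-is: O(1) update, O(n) range-sum query."""

    def __init__(self, values):
        self.cells = list(values)

    def update(self, index, delta):
        self.cells[index] += delta

    def range_sum(self, lo, hi):
        # Inclusive bounds; every cell in the range is read.
        return sum(self.cells[lo:hi + 1])


class PrefixSumArray:
    """Stores running totals: O(1) range-sum query, O(n) update."""

    def __init__(self, values):
        self.prefix = [0]
        for v in values:
            self.prefix.append(self.prefix[-1] + v)

    def update(self, index, delta):
        # Every running total at or after `index` changes.
        for i in range(index + 1, len(self.prefix)):
            self.prefix[i] += delta

    def range_sum(self, lo, hi):
        # Inclusive bounds; two lookups regardless of range length.
        return self.prefix[hi + 1] - self.prefix[lo]


daily_sales = [3, 1, 4, 1, 5, 9, 2, 6]
raw, pre = RawArray(daily_sales), PrefixSumArray(daily_sales)

print(raw.range_sum(2, 5), pre.range_sum(2, 5))
raw.update(3, 10)
pre.update(3, 10)
print(raw.range_sum(2, 5), pre.range_sum(2, 5))
```

Both structures return the same answers; they differ only in where the work is done. Precomputing aggregates moves cost from queries to updates, which is why a cube that must stay dynamic cannot simply maximize precomputation.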


Author(s):  
Chandra S. Amaravadi

In the past decade, a new and exciting technology has unfolded on the shores of the information systems area. Based on a combination of statistical and artificial intelligence techniques, data mining has emerged from relational databases and Online Analytical Processing as a powerful tool for organizational decision support (Shim et al., 2002).


Author(s):  
José María Cavero Barca ◽  
Esperanza Marcos Martinez ◽  
Mario G. Piattini ◽  
Adolfo Sánchez de Miguel

The concept of data warehouse first appeared in Inmon (1993) to describe a “subject oriented, integrated, non-volatile, and time variant collection of data in support of management’s decisions” (31). It is a concept related to the OLAP (online analytical processing) technology, first introduced by Codd et al. (1993) to characterize the requirements of aggregation, consolidation, view production, formulae application, and data synthesis in many dimensions. A data warehouse is a repository of information that mainly comes from online transactional processing (OLTP) systems that provide data for analytical processing and decision support.

