PARALLEL CONSTRUCTION OF DATA CUBES ON MULTI-CORE MULTI-DISK PLATFORMS

2013 ◽  
Vol 23 (01) ◽  
pp. 1350002
Author(s):  
FRANK DEHNE ◽  
HAMIDREZA ZABOLI

On-line Analytical Processing (OLAP) has become one of the most powerful and prominent technologies for knowledge discovery in VLDB (Very Large Database) environments. Central to the OLAP paradigm is the data cube, a multidimensional hierarchy of aggregate values that provides a rich analytical model for decision support. Various sequential algorithms for the efficient generation of the data cube have appeared in the literature. However, given the size of contemporary data warehousing repositories, multi-processor solutions are crucial for the massive computational demands of current and future OLAP systems. In this paper we discuss the development of MCMD-CUBE, a new parallel data cube construction method for multi-core processors with multiple disks. We present experimental results for a Sandy Bridge multi-core processor with four parallel disks. Our experiments indicate that MCMD-CUBE achieves very close to linear speedup. A critical part of our MCMD-CUBE method is parallel sorting. We developed a new parallel sorting method, termed MCMD-SORT, for multi-core processors with multiple disks, which outperforms previous methods.
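To make the data cube concrete, here is a minimal sequential sketch in Python. It is not the MCMD-CUBE method: it simply computes a sum aggregate for every subset of dimensions (every "group-by") of a tiny fact table. The table contents, the dimension names, and the `build_cube` helper are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact table: (product, region, year, sales).
ROWS = [
    ("pen", "east", 2012, 10),
    ("pen", "west", 2012, 5),
    ("ink", "east", 2013, 7),
    ("pen", "east", 2013, 3),
]
DIMENSIONS = ("product", "region", "year")


def build_cube(rows, dimensions):
    """Return {group-by dimensions: {key tuple: summed measure}}.

    Every subset of the d dimensions yields one group-by, so the
    result holds 2**d aggregate tables.
    """
    cube = {}
    indices = range(len(dimensions))
    for size in range(len(dimensions) + 1):
        for subset in combinations(indices, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in subset)
                totals[key] += row[-1]  # last column is the measure
            cube[tuple(dimensions[i] for i in subset)] = dict(totals)
    return cube


cube = build_cube(ROWS, DIMENSIONS)
print(len(cube))            # number of group-bys
print(cube[()])             # grand total over all rows
print(cube[("product",)])   # sales summed per product
```

Each group-by in this sketch is computed independently from the raw rows, which is simple but wasteful. How a parallel method divides this work across cores and disks is not described in the abstract, so it is left out here.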

2012 ◽  
Vol 4 (2) ◽  
pp. 32-45
Author(s):  
Dong Jin ◽  
Tatsuo Tsuji

The pre-computation of data cubes is critical for improving the response time of OLAP (On-Line Analytical Processing) systems. To meet the need for improved performance created by growing data sizes, parallel solutions for data cube construction are becoming increasingly important. This paper presents a new parallel data cube construction scheme based on an extendible multidimensional array, which is dynamically extendible along any dimension without relocating any existing data. The authors have implemented and evaluated their parallel data cube construction methods on shared-memory multiprocessors. Given the performance limit, the methods achieve close to linear speedup with load balance. The authors’ experiments also indicate that their parallel methods can be more scalable on higher-dimensional data cube construction.


Author(s):  
Maurizio Rafanelli

The term multidimensional aggregate data (MAD; see Rafanelli, 2003) generally refers to data in which a given fact is quantified by a set of measures obtained by applying a more or less complex aggregate function (count, sum, average, percent, etc.) to raw data. These measures are characterized by a set of variables, called dimensions. MAD can be modeled by different representations, depending on the application field that uses them. For example, some years ago the term referred essentially to statistical data, that is, data used mainly for socio-economic analysis. Recently, the metaphor of the data cube was taken up again and used for new applications, such as On-Line Analytical Processing (OLAP), which refer to aggregate and non-aggregate data for business analysis.
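As a small illustration of measures and dimensions, the sketch below applies the four aggregate functions named above (count, sum, average, percent) to raw records grouped by a chosen set of dimensions. The records, the dimension layout, and the `aggregate` helper are hypothetical and serve only to make the definition concrete.

```python
from collections import defaultdict

# Hypothetical raw data: (region, year, income).
RAW = [
    ("north", 2001, 100.0),
    ("north", 2001, 300.0),
    ("south", 2001, 200.0),
    ("south", 2002, 400.0),
]


def aggregate(raw, dim_indices):
    """Group raw records by the chosen dimensions and compute measures."""
    groups = defaultdict(list)
    for record in raw:
        key = tuple(record[i] for i in dim_indices)
        groups[key].append(record[-1])  # last column is the raw value

    grand_total = sum(record[-1] for record in raw)
    result = {}
    for key, values in groups.items():
        total = sum(values)
        result[key] = {
            "count": len(values),
            "sum": total,
            "average": total / len(values),
            "percent": 100.0 * total / grand_total,  # share of grand total
        }
    return result


by_region = aggregate(RAW, dim_indices=(0,))
by_region_year = aggregate(RAW, dim_indices=(0, 1))
print(by_region[("north",)])
```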


Author(s):  
Jan H. Kroeze

This chapter discusses the application of some data warehousing techniques to a data cube of linguistic data. The results of various modules of clausal analysis can be stored in a three-dimensional data cube in order to facilitate on-line analytical processing of data by means of three-dimensional arrays. Slicing is one such analytical technique, which reveals various dimensions of data and their relationships to other dimensions. By using this data warehousing facility the clause cube can be viewed or manipulated to reveal, for example, phrases and clauses, syntactic structures, semantic role frames, or a two-dimensional representation of a particular clause’s multi-dimensional analysis in table format. These functionalities are illustrated by means of the Hebrew text of Genesis 1:1-2:3. The authors trust that this chapter will contribute towards efficient storage and advanced processing of linguistic data.
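Slicing a three-dimensional array can be sketched as follows. The layout `cube[clause][phrase][module]`, the placeholder labels, and both helper functions are assumptions made for illustration; they are not the chapter's actual data or implementation.

```python
# Hypothetical clause cube indexed as cube[clause][phrase][module].
MODULES = ("text", "syntax", "semantics")
cube = [
    [  # clause 0
        ["phrase-a", "Adjunct", "Time"],
        ["phrase-b", "Main verb", "Action"],
    ],
    [  # clause 1
        ["phrase-c", "Subject", "Agent"],
        ["phrase-d", "Copula", "State"],
    ],
]


def slice_by_clause(cube, clause):
    """Fix the clause dimension: a 2-D table of phrases x modules."""
    return [list(phrase) for phrase in cube[clause]]


def slice_by_module(cube, module):
    """Fix the module dimension: a 2-D table of clauses x phrases."""
    return [[phrase[module] for phrase in clause] for clause in cube]


# One clause's full analysis as a table (rows = phrases).
for row in slice_by_clause(cube, 0):
    print(row)

# The syntactic layer across all clauses.
print(slice_by_module(cube, MODULES.index("syntax")))
```

Fixing one index yields a two-dimensional table, which matches the table-format view of a single clause's analysis mentioned above.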


Author(s):  
Harkiran Kaur ◽  
Kawaljeet Singh ◽  
Tejinder Kaur

Background: Numerous E-Migrants databases help migrants locate their peers in various countries, and thus contribute greatly to communication among migrants staying overseas. At present, these traditional E-Migrants databases suffer from non-scalability, difficult search mechanisms, and burdensome information update routines. Furthermore, analysis of migrants’ profiles in these databases has not been addressed to date, so the databases do not generate any knowledge. Objective: To design and develop an efficient, multidimensional knowledge discovery framework for E-Migrants databases. Method: In the proposed technique, the results of complex calculations for the On-Line Analytical Processing operations most likely to be required by end users are stored in the form of Decision Trees at the preprocessing stage of data analysis. While the Cube is being browsed, these pre-computed results are retrieved, thus offering a Dynamic Cubing feature to end users at runtime. This data-tuning step reduces query processing time and increases the efficiency of the required data warehouse operations. Results: Experiments conducted with a Data Warehouse of around 1000 migrants’ profiles confirm the knowledge discovery power of this proposal. Using the proposed methodology, the authors have designed a framework efficient enough to incorporate amendments made to the E-Migrants Data Warehouse systems at regular intervals, a capability that was entirely missing from traditional E-Migrants databases. Conclusion: The proposed methodology helps migrants generate dynamic knowledge and visualize it in the form of dynamic cubes. By applying Business Intelligence mechanisms and blending them with tuned OLAP operations, the authors have transformed traditional datasets into an intelligent migrants Data Warehouse.


2013 ◽  
Vol 765-767 ◽  
pp. 2590-2594
Author(s):  
Qian Jin Wang

Multi-core processors have been a hot topic because they improve operation speed. Efficient parallel data-processing algorithms are hard to obtain because hardware resources are often wasted. In this paper, a novel multitask parallel algorithm based on finding the common substring of two strings is described, with the aim of improving the data-handling capacity of the multi-processor. The algorithm first uses the Task Parallel Library (TPL) in VS.NET and then schedules the proposed algorithm to process the data. The algorithm is tested on actual parallel data. The results demonstrate that it overcomes the problem of wasted hardware resources and takes full advantage of multi-core parallel data processing, thereby enhancing the parallel speedup and greatly improving the efficiency of data processing.
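The general idea of running common-substring tasks in parallel can be sketched in Python. This is not the paper's algorithm and does not use the .NET Task Parallel Library: it swaps in the standard dynamic-programming solution for the longest common substring and Python's `concurrent.futures` thread pool as the task scheduler. The function names and the sample string pairs are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def longest_common_substring(a, b):
    """Classic O(len(a) * len(b)) dynamic programme, two rows of memory."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common suffix ending at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def solve_all(pairs, workers=4):
    """Run one task per string pair; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda pair: longest_common_substring(*pair), pairs))


# Hypothetical workload: independent pairs, one task each.
PAIRS = [("abcdef", "zcdemf"), ("multi-core", "multicore"), ("abc", "xyz")]
results = solve_all(PAIRS)
print(results)
```

In CPython, threads do not speed up CPU-bound work such as this because of the global interpreter lock; the thread pool only shows the task structure. Replacing it with `ProcessPoolExecutor` (and a module-level function instead of the lambda) is the usual way to occupy several cores.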


Author(s):  
Anastasia Y. Nikitaeva

This chapter substantiates the importance of improving the management effectiveness of mesoeconomic systems in current economic conditions, and describes the features of the mesoeconomy as a management object that make decision making at the meso level highly complex. There are approaches, methods, and technologies that support the decision making process by integrating formal methods for objective data analysis with methods of accounting to solve semi-structured complex problems of the mesoeconomy. A cognitive approach and an approach that integrates On-Line Analytical Processing and data mining technologies with methods of multi-criteria assessment of alternatives, in particular methods of Multi-Attribute Utility Theory, are considered in the chapter. Cognitive mapping of the interaction between state and business in a mesoeconomic system is included as a case study.

