Multidimensional Modeling of Complex Data

Author(s):  
Omar Boussaid ◽  
Doulkifli Boukraa

While classical databases aim to manage data within enterprises, data warehouses help enterprises analyze data in order to drive their activities (Inmon, 2005). Data warehouses have proven their usefulness in the decision-making process by presenting valuable data to users and allowing them to analyze it online (Rafanelli, 2003). Current data warehouse and OLAP tools deal, for the most part, with numerical data that is usually structured using the relational model. As a result, considerable amounts of unstructured or semi-structured data are left unexploited. We qualify such data as “complex data” because they originate from different sources, take multiple forms, and have complex relationships among them. Warehousing and exploiting such data raises many issues. In particular, modeling a complex data warehouse using the traditional star schema is no longer adequate, for several reasons (Boussaïd, Ben Messaoud, Choquet, & Anthoard, 2006; Ravat, Teste, Tournier, & Zurfluh, 2007b). First, the complex structure of the data needs to be preserved rather than flattened into a linear set of attributes. Second, the relationships that exist between data need to be preserved and exploited when performing the analysis. Finally, new aggregation modes may be needed (Ben Messaoud, Boussaïd, & Loudcher, 2006; Ravat, Teste, Tournier, & Zurfluh, 2007a) that are based on textual rather than numerical data. The design and modeling of decision support systems based on complex data is an exciting scientific challenge (Pedersen & Jensen, 1999; Jones & Song, 2005; Luján-Mora, Trujillo, & Song, 2006). In particular, modeling a complex data warehouse at the conceptual level and then at the logical level is not straightforward, and little work has been done on these activities.
At the conceptual level, most of the proposed models are object-oriented (Ravat et al., 2007a; Nassis, Rajugan, Dillon, & Rahayu, 2004), and some of them use UML as a notation language. At the logical level, XML has been used in many models because it is well suited to modeling both structured and semi-structured data (Pokorný, 2001; Baril & Bellahsène, 2003; Boussaïd et al., 2006). In this chapter, we propose an approach to the multidimensional modeling of complex data at both the conceptual and logical levels. Our conceptual model answers some modeling requirements that we believe are not fulfilled by current models. These modeling requirements are exemplified by the Digital Bibliography & Library Project (DBLP) case study.

Author(s):  
Doulkifli Boukraa ◽  
Riadh Ben Messaoud ◽  
Omar Boussaid

Current data warehouses deal for the most part with numerical data. However, decision makers need to analyze data in all formats, which one can qualify as complex data. Warehousing complex data is a new challenge for the scientific community. Indeed, it requires revisiting the whole warehousing process in order to take into account the complex structure of the data; therefore, many concepts of data warehousing will need to be redefined. In particular, modeling complex data in a single format for analysis purposes is a challenge. In this chapter, the authors present a complex data warehouse model at both the conceptual and logical levels. They show how XML is suitable for capturing the main concepts of their model, and present the main issues related to these data warehouses.


2008 ◽  
pp. 2364-2370
Author(s):  
Janet Delve

Data warehousing is now a well-established part of the business and scientific worlds. However, until recently, data warehouses were restricted to modeling essentially numerical data, such as sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data playing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


Author(s):  
Janet Delve

Data warehousing is now a well-established part of the business and scientific worlds. However, until recently, data warehouses were restricted to modeling essentially numerical data, such as sales figures in the business arena (for example, in Wal-Mart’s data warehouse; Westerman, 2000) and astronomical data (for example, SKICAT) in scientific research, with textual data playing a descriptive rather than a central analytic role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for ‘non-numeric’ data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


Author(s):  
Jérôme Darmont ◽  
Emerson Olivier

In this context, the warehouse measures, though not necessarily numerical, remain the indicators for analysis, and analysis is still performed following different perspectives represented by dimensions. Large data volumes and their dating are other arguments in favor of this approach (Darmont et al., 2003). Data warehousing can also support various types of analysis, such as statistical reporting, on-line analysis (OLAP) and data mining. The aim of this article is to present an overview of the existing data warehouses for biomedical data and to discuss the issues and future trends in biomedical data warehousing. We illustrate this topic by presenting the design of an innovative, complex data warehouse for personal, anticipative medicine.


Data Mining ◽  
2013 ◽  
pp. 1422-1448
Author(s):  
Fadila Bentayeb ◽  
Nora Maïz ◽  
Hadj Mahboubi ◽  
Cécile Favre ◽  
Sabine Loudcher ◽  
...  

Research in data warehousing and OLAP has produced important technologies for the design, management, and use of information systems for decision support. With the development of the Internet, the availability of various types of data has increased. Users therefore require applications that help them obtain knowledge from the Web. One possible solution is to extract information from the Web, transform it, and load it into a Web warehouse, which provides uniform access methods for automatic processing of the data. In this chapter, we present three recent research contributions that extend the capabilities of decision support systems, namely (1) the use of XML as a logical and physical model for complex data warehouses, (2) the association of data mining with OLAP to allow elaborate analysis tasks on complex data, and (3) schema evolution in complex data warehouses for personalized analyses. Our contributions cover the main phases of the data warehouse design process: data integration and modeling, and user-driven OLAP analysis.


Author(s):  
Xuegang Huang

The wide adoption of business intelligence applications has led more and more organizations to build and maintain data warehouse systems. Concepts like “unified view of data” and “one version of the truth” have been the main drivers for creating data warehouses. The dynamics of the business world pose the challenge of managing large volumes of complex data in data warehouses while also meeting real-time integration and master data needs. This chapter summarizes the past and present patterns of typical data warehouse architectures and describes how the concept of service-oriented architecture influences the future evolution of data warehouse architecture. The discussion draws on many real-world requirements of data warehouse solutions and lists considerations on how architecture patterns can address these requirements.


Author(s):  
Hadj Mahboubi ◽  
Thierry Faure ◽  
Sandro Bimonte ◽  
Guillaume Deffuant ◽  
Jean-Pierre Chanet ◽  
...  

This paper examines the multidimensional modeling of a data warehouse for simulation results. Environmental dynamics modeling is used to study complex scenarios like urbanization, climate change, and deforestation, while allowing decision makers to understand and predict the evolution of the environment in response to potential value changes in a large number of influence variables. In this context, exploring simulation models produces a huge volume of data, which must often be studied extensively at different levels of aggregation. There is therefore a great need to define tools and methodologies specifically adapted to the storage and analysis of such complex data. Data warehousing systems provide technologies for managing simulation results from different sources. Moreover, OLAP technologies allow one to analyze and compare these results and their corresponding models. In this paper, the authors propose a generic multidimensional schema for analyzing the results of a simulation model, which can guide modelers in designing specific data warehouses, and an adaptation of an OLAP client tool that provides an adequate visualization of the data. As an example, a data warehouse for analyzing the results produced by a savanna simulation model is implemented using a Relational OLAP architecture.


Author(s):  
Munesh Chandra Trivedi ◽  
Virendra Kumar Yadav ◽  
Avadhesh Kumar Gupta

A data warehouse generally contains both historical and current data from various data sources. In computing, a data warehouse can be defined as a system created for the analysis and reporting of both types of data. The resulting analysis reports are then used by an organization to make decisions that help it grow. Constructing a data warehouse appears simple: data is collected from the data sources into one place (after extraction, transformation, and loading). But construction involves several issues, such as inconsistent data, logic conflicts, user acceptance, cost, quality, security, stakeholders’ contradictions, REST alignment, etc. These issues need to be overcome; otherwise they will lead to unfortunate consequences affecting the organization’s growth. The proposed model tries to solve issues such as REST alignment and stakeholders’ contradictions by involving experts from various domains (technical, analytical, decision makers, management representatives, etc.) during the initialization phase to better understand the requirements, and by mapping these requirements to data sources during the design phase of the data warehouse.


Author(s):  
Hadj Mahboubi ◽  
Jérôme Darmont

XML data warehouses form an interesting basis for decision-support applications that exploit complex data. However, native-XML database management systems (DBMSs) currently offer limited performance, and it is necessary to research ways to optimize them. In this chapter, the authors present two such techniques. First, they propose an XML join index that is specifically adapted to the multidimensional architecture of XML warehouses. It eliminates join operations while preserving the information contained in the original warehouse. Second, the authors present a strategy for selecting XML materialized views by clustering the query workload. To validate these proposals, the authors measure the response time of a set of decision-support XQueries over an XML data warehouse, with and without their optimization techniques. The authors’ experimental results demonstrate the efficiency of these techniques, even when queries are complex and data are voluminous.



