Modeling XML Warehouses for Complex Data

While classical databases aim at managing data within enterprises, data warehouses help enterprises analyze data in order to drive their activities (Inmon, 2005). Data warehouses have proven their usefulness in the decision-making process by presenting valuable data to users and allowing them to analyze it online (Rafanelli, 2003). Current data warehouse and OLAP tools deal, for the most part, with numerical data that is usually structured using the relational model. Consequently, considerable amounts of unstructured or semi-structured data are left unexploited. We qualify such data as “complex data” because they originate from different sources, have multiple forms, and have complex relationships among them. Warehousing and exploiting such data raises many issues. In particular, modeling a complex data warehouse using the traditional star schema is no longer adequate, for several reasons (Boussaïd, Ben Messaoud, Choquet, & Anthoard, 2006; Ravat, Teste, Tournier, & Zurfluh, 2007b). First, the complex structure of the data needs to be preserved rather than flattened into a linear set of attributes. Second, the relationships that exist between data need to be preserved and exploited when performing the analysis. Finally, new aggregation modes (Ben Messaoud, Boussaïd, & Loudcher, 2006; Ravat, Teste, Tournier, & Zurfluh, 2007a) based on textual rather than numerical data may be needed. The design and modeling of decision support systems based on complex data is an exciting scientific challenge (Pedersen & Jensen, 1999; Jones & Song, 2005; Luján-Mora, Trujillo, & Song, 2006). In particular, modeling a complex data warehouse at the conceptual level and then at the logical level is not straightforward, and little work has been done on these activities.
At the conceptual level, most of the proposed models are object-oriented (Ravat et al., 2007a; Nassis, Rajugan, Dillon, & Rahayu, 2004), and some of them use UML as a notation language. At the logical level, XML has been used in many models because it is well suited to modeling both structured and semi-structured data (Pokorný, 2001; Baril & Bellahsène, 2003; Boussaïd et al., 2006). In this chapter, we propose an approach to the multidimensional modeling of complex data at both the conceptual and logical levels. Our conceptual model answers some modeling requirements that we believe are not fulfilled by current models. These modeling requirements are exemplified by the Digital Bibliography & Library Project (DBLP) case study.

Humanities Data Warehousing

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch141 ◽

2008 ◽

pp. 2364-2370

Author(s):

Janet Delve

Keyword(s):

Data Warehouse ◽

Relational Databases ◽

Data Warehousing ◽

Numerical Data ◽

Complex Nature ◽

Data Warehouses ◽

Textual Data ◽

Numeric Data ◽

First Time ◽

And Linguistics

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data, such as sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data playing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.

Humanities Data Warehousing

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch153 ◽

2011 ◽

pp. 987-992

Author(s):

Janet Delve

Keyword(s):

Data Warehouse ◽

Relational Databases ◽

Data Warehousing ◽

Numerical Data ◽

Complex Nature ◽

Data Warehouses ◽

Textual Data ◽

Numeric Data ◽

First Time ◽

And Linguistics

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data, such as sales figures in the business arena (for example, Wal-Mart’s data warehouse (Westerman, 2000)) and astronomical data (for example, SKICAT) in scientific research, with textual data playing a descriptive rather than a central analytic role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for ‘non-numeric’ data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.

Biomedical Data Warehouses

Encyclopedia of Healthcare Information Systems ◽

10.4018/978-1-59904-889-5.ch021 ◽

2008 ◽

pp. 149-156 ◽

Cited By ~ 1

Author(s):

Jérôme Darmont ◽

Emerson Olivier

Keyword(s):

Data Warehouse ◽

Data Warehousing ◽

Large Data ◽

Complex Data ◽

Biomedical Data ◽

Future Trends ◽

Data Warehouses ◽

On Line ◽

Line Analysis ◽

Existing Data

In this context, the warehouse measures, though not necessarily numerical, remain the indicators for analysis, and analysis is still performed following different perspectives represented by dimensions. Large data volumes and their dating are other arguments in favor of this approach (Darmont et al., 2003). Data warehousing can also support various types of analysis, such as statistical reporting, on-line analysis (OLAP) and data mining. The aim of this article is to present an overview of the existing data warehouses for biomedical data and to discuss the issues and future trends in biomedical data warehousing. We illustrate this topic by presenting the design of an innovative, complex data warehouse for personal, anticipative medicine.

A Proposed DDS Enabled Model for Data Warehouses with Real Time Updates

International Journal of Informatics and Communication Technology (IJ-ICT) ◽

10.11591/ijict.v7i1.pp31-38 ◽

2018 ◽

Vol 7 (1) ◽

pp. 31

Author(s):

Munesh Chandra Trivedi ◽

Virendra Kumar Yadav ◽

Avadhesh Kumar Gupta

Keyword(s):

Real Time ◽

Data Warehouse ◽

User Acceptance ◽

Current Data ◽

Decision Makers ◽

Data Sources ◽

Design Phase ◽

Data Warehouses ◽

Inconsistent Data ◽

Proposed Model

A data warehouse generally contains both historical and current data from various data sources. In computing, a data warehouse can be defined as a system created for the analysis and reporting of both types of data. The resulting analysis reports are then used by an organization to make decisions that help it grow. Constructing a data warehouse appears simple: data is collected from the data sources into one place (after extraction, transformation, and loading). In practice, however, construction involves several issues, such as inconsistent data, logic conflicts, user acceptance, cost, quality, security, stakeholders' contradictions, and REST alignment. These issues need to be overcome; otherwise they lead to unfortunate consequences that affect the organization's growth. The proposed model tries to solve issues such as REST alignment and stakeholders' contradictions by involving experts from various domains (technical, analytical, decision makers, management representatives, etc.) during the initialization phase to better understand the requirements, and by mapping these requirements to data sources during the design phase of the data warehouse.

Humanities Data Warehousing

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch108 ◽

2011 ◽

pp. 570-574

Author(s):

Janet Delve

Keyword(s):

Data Warehouse ◽

Relational Databases ◽

Data Warehousing ◽

Numerical Data ◽

Complex Nature ◽

Data Warehouses ◽

Textual Data ◽

Numeric Data ◽

First Time ◽

And Linguistics

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data, such as sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data playing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.

Harvesting Information from a Library Data Warehouse

Information Technology and Libraries ◽

10.6017/ital.v19i1.10070 ◽

2017 ◽

Vol 19 (1) ◽

pp. 17-28 ◽

Cited By ~ 2

Author(s):

Siew-Phek T. Su ◽

Ashwin Needamangala

Keyword(s):

Decision Support ◽

Knowledge Discovery ◽

Data Warehouse ◽

Data Warehousing ◽

End Users ◽

Cutting Edge ◽

Data Warehouses ◽

Student Data ◽

Integrated Platform ◽

Academic Sector

Data warehousing technology has been defined by John Ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that delivers data to end users on an integrated platform." (1) This concept has been applied increasingly by industries worldwide to develop data warehouses for decision support and knowledge discovery. In the academic sector, several universities have developed data warehouses containing the universities' financial, payroll, personnel, budget, and student data. (2) These data warehouses across all industries and academia have met with varying degrees of success. Data warehousing technology and its related issues have been widely discussed and published. (3) Little has been done, however, on the application of this cutting-edge technology in the library environment using library data.

Data Warehouse Design to Support Customer Relationship Management Analyses

Database Technologies ◽

10.4018/978-1-60566-058-5.ch042 ◽

2009 ◽

pp. 702-724

Author(s):

Colleen Cunningham ◽

Il-Yeol Song ◽

Peter P. Chen

Keyword(s):

Data Warehouse ◽

Data Warehousing ◽

Customer Relationship ◽

Data Warehouses ◽

Design Decisions ◽

Long Term Study ◽

Percent Success ◽

The Impact

CRM is a strategy that integrates concepts of knowledge management, data mining, and data warehousing in order to support an organization’s decision-making process to retain long-term and profitable relationships with its customers. This research is part of a long-term study to examine systematically CRM factors that affect design decisions for CRM data warehouses in order to build a taxonomy of CRM analyses and to determine the impact of those analyses on CRM data warehousing design decisions. This article presents the design implications that CRM poses to data warehousing and then proposes a robust multidimensional starter model that supports CRM analyses. Additional research contributions include the introduction of two new measures, percent success ratio and CRM suitability ratio by which CRM models can be evaluated, the identification of and classification of CRM queries, and a preliminary heuristic for designing data warehouses to support CRM analyses.

Data Warehouse Design to Support Customer Relationship Management Analysis

Strategic Information Systems ◽

10.4018/978-1-60566-677-8.ch048 ◽

2011 ◽

pp. 731-752

Author(s):

Colleen Cunningham ◽

Il-Yeol Song ◽

Peter P. Chen

Keyword(s):

Data Warehouse ◽

Data Warehousing ◽

Customer Relationship ◽

Data Warehouses ◽

Design Decisions ◽

Long Term Study ◽

Percent Success ◽

The Impact

CRM is a strategy that integrates concepts of knowledge management, data mining, and data warehousing in order to support an organization’s decision-making process to retain long-term and profitable relationships with its customers. This research is part of a long-term study to examine systematically CRM factors that affect design decisions for CRM data warehouses in order to build a taxonomy of CRM analyses and to determine the impact of those analyses on CRM data warehousing design decisions. This article presents the design implications that CRM poses to data warehousing and then proposes a robust multidimensional starter model that supports CRM analyses. Additional research contributions include the introduction of two new measures, percent success ratio and CRM suitability ratio by which CRM models can be evaluated, the identification of and classification of CRM queries, and a preliminary heuristic for designing data warehouses to support CRM analyses.

Data Warehousing

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch171 ◽

2008 ◽

pp. 2749-2761

Author(s):

Hugh J. Watson ◽

Barbara H. Wixom ◽

Dale L. Goodhue

Keyword(s):

Decision Support ◽

Real Time ◽

Data Warehouse ◽

Data Warehousing ◽

Top Management ◽

Data Warehouses ◽

Web Based ◽

Good Data ◽

Customer Services ◽

Enterprise Data Warehouse

Data warehouses are helping resolve a major problem that has plagued decision support applications over the years — a lack of good data. Top management at 3M realized that the company had to move from being product-centric to being customer savvy. In response, 3M built a terabyte data warehouse (global enterprise data warehouse) that provides thousands of 3M employees with real-time access to accurate, global, detailed information. The data warehouse underlies new Web-based customer services that are dynamically generated based on warehouse information. There are useful lessons that were learned at 3M during their years of developing the data warehouse.
