Temporal Data Warehousing

Author(s):  
Matteo Golfarelli ◽  
Stefano Rizzi

Data warehouses are information repositories specialized in supporting decision making. Since the decisional process typically requires an analysis of historical trends, time and its management acquire great importance. In this paper we consider the variety of issues, often grouped under the term temporal data warehousing, implied by the need to accurately describe how information changes over time in data warehousing systems. We recognize that, with reference to a three-level architecture, these issues can be classified into four topics, namely: handling data/schema changes in the data warehouse, handling data/schema changes in the data mart, querying temporal data, and designing temporal data warehouses. After introducing the main concepts and terminology of temporal databases, we survey these topics separately. Finally, we discuss the open research issues, also in connection with their implementation on commercial tools.

2009 ◽  
pp. 221-237 ◽  
Author(s):  
Matteo Golfarelli ◽  
Stefano Rizzi

Data warehouses are information repositories specialized in supporting decision making. Since the decisional process typically requires an analysis of historical trends, time and its management acquire great importance. In this paper we consider the variety of issues, often grouped under the term temporal data warehousing, implied by the need to accurately describe how information changes over time in data warehousing systems. We recognize that, with reference to a three-level architecture, these issues can be classified into four topics, namely: handling data/schema changes in the data warehouse, handling data/schema changes in the data mart, querying temporal data, and designing temporal data warehouses. After introducing the main concepts and terminology of temporal databases, we survey these topics separately. Finally, we discuss the open research issues, also in connection with their implementation on commercial tools.


Author(s):  
Oscar Romero ◽  
Alberto Abelló

In recent years, data warehousing systems have gained relevance in supporting decision making within organizations. The core component of these systems is the data warehouse, and nowadays it is widely assumed that data warehouse design must follow the multidimensional paradigm. Thus, many methods have been presented to support the multidimensional design of the data warehouse. The first methods introduced were requirement-driven, but the semantics of the data warehouse (since the data warehouse is the result of homogenizing and integrating relevant data of the organization in a single, detailed view of the organization's business) require that the data sources also be considered during the design process. Considering the data sources gave rise to several data-driven methods that automate the data warehouse design process, mainly from relational data sources. Currently, research on multidimensional modeling is still a hot topic, and there are two main research lines. On the one hand, new hybrid automatic methods have been introduced that propose combining data-driven and requirement-driven approaches. These methods focus on automating the whole process and improving the feedback retrieved by each approach to produce better results. On the other hand, some new approaches focus on scenarios other than relational sources. These methods also consider (semi-)structured data sources, such as ontologies or XML, that have gained relevance in recent years. Thus, they introduce innovative solutions for overcoming the heterogeneity of the data sources. All in all, we discuss the current scenario of multidimensional modeling by carrying out a survey of multidimensional design methods. We present the most relevant methods introduced in the literature and a detailed comparison showing the main features of each approach.


Author(s):  
Beixin ("Betsy") Lin ◽  
Yu Hong ◽  
Zu-Hsu Lee

A data warehouse is a large electronic repository of information that is generated and updated in a structured manner by an enterprise over time to aid business intelligence and to support decision making. Data stored in a data warehouse is non-volatile and time-variant and is organized by subject to support decision making (Inmon et al., 2001). Data warehousing has been increasingly adopted by enterprises as the backbone technology for business intelligence reporting, and query performance has become the key to the successful implementation of data warehouses. According to a survey of 358 businesses on reporting and end-user query tools, conducted by Appfluent Technology, data warehouse performance significantly affects the Return on Investment (ROI) on Business Intelligence (BI) systems and directly impacts the bottom line of the systems (Appfluent Technology, 2002). Even though in some circumstances it is very difficult to measure the benefits of BI projects in terms of ROI or dollar figures, management teams are still eager to have a “single version of the truth,” better information for strategic and tactical decision making, and more efficient business processes by using BI solutions (Eckerson, 2003). Dramatic increases in data volumes over time and the mixed quality of data can adversely affect the performance of a data warehouse. Some data may become outdated over time and can be mixed with data that are still valid for decision making. In addition, data are often collected to meet potential requirements, but may never be used. Data warehouses also contain external data (e.g., demographic and psychographic data) to support a variety of predictive data mining activities. All these factors contribute to the massive growth of data volume. As a result, even a simple query may become burdensome to process and cause overflowing system indices (Inmon et al., 1998). Thus, exploring the techniques of performance tuning becomes an important subject in data warehouse management.


2008 ◽  
pp. 2364-2370
Author(s):  
Janet Delve

Data Warehousing is now a well-established part of the business and scientific worlds. However, until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data playing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


2017 ◽  
Vol 19 (1) ◽  
pp. 17-28 ◽  
Author(s):  
Siew-Phek T. Su ◽  
Ashwin Needamangala

Data warehousing technology has been defined by John Ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that delivers data to end users on an integrated platform." (1) This concept has been applied increasingly by industries worldwide to develop data warehouses for decision support and knowledge discovery. In the academic sector, several universities have developed data warehouses containing the universities' financial, payroll, personnel, budget, and student data. (2) These data warehouses across all industries and academia have met with varying degrees of success. Data warehousing technology and its related issues have been widely discussed and published. (3) Little has been done, however, on the application of this cutting-edge technology in the library environment using library data.


1998 ◽  
Vol 07 (02n03) ◽  
pp. 215-247 ◽  
Author(s):  
MATTEO GOLFARELLI ◽  
DARIO MAIO ◽  
STEFANO RIZZI

Data warehousing systems enable enterprise managers to acquire and integrate information from heterogeneous sources and to query very large databases efficiently. Building a data warehouse requires adopting design and implementation techniques completely different from those underlying operational information systems. Though most scientific literature on the design of data warehouses concerns their logical and physical models, an accurate conceptual design is the necessary foundation for building a DW which is well-documented and fully satisfies requirements. In this paper we formalize a graphical conceptual model for data warehouses, called the Dimensional Fact Model, and propose a semi-automated methodology to build it from the pre-existing (conceptual or logical) schemes describing the enterprise relational database. The representation of reality built using our conceptual model consists of a set of fact schemes whose basic elements are facts, measures, attributes, dimensions and hierarchies; other features which may be represented on fact schemes are the additivity of fact attributes along dimensions, the optionality of dimension attributes and the existence of non-dimension attributes. Compatible fact schemes may be overlapped in order to relate and compare data for drill-across queries. Fact schemes should be integrated with information on the conjectured workload, to be used as the input of logical and physical design phases; to this end, we propose a simple language to denote data warehouse queries in terms of sets of fact instances.


Author(s):  
Roelien Goede

Information systems, and in particular data warehouses, are very expensive systems to develop. It is therefore not advisable to experiment with ideas too different from current practices. This makes it difficult to apply prescriptive theories in an existing field. From theoretical considerations one might want to develop a data warehouse according to another method, such as critical systems thinking methodology. It is, however, very difficult to persuade data warehouse practitioners to attempt such an experiment. This might be because they would rather adhere to known practices or because they are not sufficiently knowledgeable about critical systems thinking (or any other prescriptive theory) to apply it to such an expensive project. This paper describes a method in which prescriptive theories may be used descriptively to analyse their applicability in a specific field of application. The proposed method is used to understand the practices of the data warehouse discipline from the perspective of the systems thinking discipline. It is also indicated how this method could be used in other studies where the behaviour of participants is viewed from a point of view whose details are unknown to the participants.
Keywords: Data warehousing, Systems thinking, Prescriptive theory, Descriptive theory, Interpretative research.
Disciplines: Information technology, systems theory, data warehousing, hermeneutics


2009 ◽  
pp. 702-724
Author(s):  
Colleen Cunningham ◽  
Il-Yeol Song ◽  
Peter P. Chen

CRM is a strategy that integrates concepts of knowledge management, data mining, and data warehousing in order to support an organization’s decision-making process to retain long-term and profitable relationships with its customers. This research is part of a long-term study to examine systematically the CRM factors that affect design decisions for CRM data warehouses in order to build a taxonomy of CRM analyses and to determine the impact of those analyses on CRM data warehousing design decisions. This article presents the design implications that CRM poses to data warehousing and then proposes a robust multidimensional starter model that supports CRM analyses. Additional research contributions include the introduction of two new measures, percent success ratio and CRM suitability ratio, by which CRM models can be evaluated; the identification and classification of CRM queries; and a preliminary heuristic for designing data warehouses to support CRM analyses.


2011 ◽  
pp. 731-752
Author(s):  
Colleen Cunningham ◽  
Il-Yeol Song ◽  
Peter P. Chen

CRM is a strategy that integrates concepts of knowledge management, data mining, and data warehousing in order to support an organization’s decision-making process to retain long-term and profitable relationships with its customers. This research is part of a long-term study to examine systematically the CRM factors that affect design decisions for CRM data warehouses in order to build a taxonomy of CRM analyses and to determine the impact of those analyses on CRM data warehousing design decisions. This article presents the design implications that CRM poses to data warehousing and then proposes a robust multidimensional starter model that supports CRM analyses. Additional research contributions include the introduction of two new measures, percent success ratio and CRM suitability ratio, by which CRM models can be evaluated; the identification and classification of CRM queries; and a preliminary heuristic for designing data warehouses to support CRM analyses.


2008 ◽  
pp. 2749-2761
Author(s):  
Hugh J. Watson ◽  
Barbara H. Wixom ◽  
Dale L. Goodhue

Data warehouses are helping resolve a major problem that has plagued decision support applications over the years — a lack of good data. Top management at 3M realized that the company had to move from being product-centric to being customer savvy. In response, 3M built a terabyte data warehouse (global enterprise data warehouse) that provides thousands of 3M employees with real-time access to accurate, global, detailed information. The data warehouse underlies new Web-based customer services that are dynamically generated based on warehouse information. There are useful lessons that were learned at 3M during their years of developing the data warehouse.

