An Experimental Replication With Data Warehouse Metrics

2008 ◽  
pp. 408-428
Author(s):  
Manuel Serrano ◽  
Coral Calero ◽  
Mario Piattini

Data warehouses are large repositories that integrate data from several sources for analysis and decision support. Data warehouse quality is crucial, because a bad data warehouse design may lead to the rejection of the decision support system or may result in non-productive decisions. In recent years, we have been working on the definition and validation of software metrics in order to assure data warehouse quality. Some of the metrics are adapted directly from previous ones defined for relational databases, and others are specific to data warehouses. In this paper, we present part of the empirical work we have carried out to determine whether the proposed metrics can be used as indicators of data warehouse quality. Previously, we conducted an experiment and its replication, and in this paper, we present the second replication, carried out with the purpose of assessing data warehouse maintainability. As a result of the whole empirical work, we have obtained a subset of the proposed metrics that seem to be good indicators of data warehouse quality.

Author(s):  
François Pinet ◽  
Myoung-Ah Kang ◽  
Kamal Boulil ◽  
Sandro Bimonte ◽  
Gil De Sousa ◽  
...  

Recent research proposes using Object-Oriented (OO) approaches, such as UML, to model data warehouses. This paper overviews these recent OO techniques, describing the facts and different analysis dimensions of the data. The authors propose a tutorial of the Object Constraint Language (OCL) and show how this language can be used to specify constraints in OO-based models of data warehouses. Previously, OCL has only been applied to describe constraints in software applications and transactional databases. As such, the authors demonstrate in this paper how to use OCL to represent the different types of data warehouse constraints. This paper helps researchers working in the fields of business intelligence and decision support systems who wish to learn about the major possibilities that OCL offers in the context of data warehouses. The authors also provide general information about the possible types of implementation of multi-dimensional models and their constraints.


2008 ◽  
pp. 2364-2370
Author(s):  
Janet Delve

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (e.g. Wal-Mart’s data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data providing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


Author(s):  
Johann Eder ◽  
Karl Wiggisser

Data Warehouses typically are building blocks of decision support systems in companies and public administration. The data contained in a data warehouse is analyzed by means of OnLine Analytical Processing tools, which provide sophisticated features for aggregating and comparing data. Decision support applications depend on the reliability and accuracy of the contained data. Typically, a data warehouse does not only comprise the current snapshot data but also historical data to enable, for instance, analysis over several years. And, as we live in a changing world, one criterion for the reliability and accuracy of the results of such long period queries is their comparability. Whereas data warehouse systems are well prepared for changes in the transactional data, they are, surprisingly, not able to deal with changes in the master data. Nonetheless, such changes do frequently occur. The crucial point for supporting changes is, first of all, being aware of their existence. Second, once you know that a change took place, it is important to know which change (i.e., knowing about differences between versions and relations between the elements of different versions). For data warehouses this means that changes are identified and represented, validity of data and structures are recorded and this knowledge is used for computing correct results for OLAP queries. This chapter is intended to motivate the need for powerful maintenance mechanisms for data warehouse cubes. It presents some basic terms and definitions for the common understanding and introduces the different aspects of data warehouse maintenance. Furthermore, several approaches addressing the problem are presented and classified by their capabilities.


Author(s):  
Maurizio Pighin ◽  
Lucio Ieronutti

Data Warehouses are increasingly used by commercial organizations to extract, from a huge amount of transactional data, concise information useful for supporting decision processes. However, the task of designing a data warehouse and evaluating its effectiveness is not trivial, especially in the case of large databases and in the presence of redundant information. The meaning and the quality of selected attributes heavily influence the data warehouse’s effectiveness and the quality of derived decisions. Our research is focused on interactive methodologies and techniques targeted at supporting data warehouse design and evaluation by taking into account the quality of initial data. In this chapter we propose an approach for supporting data warehouse development and refinement, providing practical examples and demonstrating the effectiveness of our solution. Our approach is mainly based on two phases: the first one is targeted at interactively guiding attribute selection by providing quantitative information measuring different statistical and syntactical aspects of data, while the second phase, based on a set of 3D visualizations, makes it possible to refine design choices at run time according to data examination and analysis. To test the proposed solutions on real data, we have developed a tool, called ELDA (EvaLuation DAta warehouse quality), that has been used for supporting data warehouse design and evaluation.


2017 ◽  
Vol 19 (1) ◽  
pp. 17-28 ◽  
Author(s):  
Siew-Phek T. Su ◽  
Ashwin Needamangala

Data warehousing technology has been defined by John Ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that delivers data to end users on an integrated platform." (1) This concept has been applied increasingly by industries worldwide to develop data warehouses for decision support and knowledge discovery. In the academic sector, several universities have developed data warehouses containing the universities' financial, payroll, personnel, budget, and student data. (2) These data warehouses across all industries and academia have met with varying degrees of success. Data warehousing technology and its related issues have been widely discussed and published. (3) Little has been done, however, on the application of this cutting-edge technology in the library environment using library data.


2008 ◽  
pp. 2749-2761
Author(s):  
Hugh J. Watson ◽  
Barbara H. Wixom ◽  
Dale L. Goodhue

Data warehouses are helping resolve a major problem that has plagued decision support applications over the years — a lack of good data. Top management at 3M realized that the company had to move from being product-centric to being customer savvy. In response, 3M built a terabyte data warehouse (global enterprise data warehouse) that provides thousands of 3M employees with real-time access to accurate, global, detailed information. The data warehouse underlies new Web-based customer services that are dynamically generated based on warehouse information. There are useful lessons that were learned at 3M during their years of developing the data warehouse.


Author(s):  
Janet Delve

Data Warehousing is now a well-established part of the business and scientific worlds. However, up until recently, data warehouses were restricted to modeling essentially numerical data – examples being sales figures in the business arena (in, say, Wal-Mart’s data warehouse (Westerman, 2000)) and astronomical data (for example SKICAT) in scientific research, with textual data providing a descriptive rather than a central analytic role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for ‘non-numeric’ data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model, and manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


2018 ◽  
Vol 6 (3) ◽  
pp. 1-6
Author(s):  
Valdrin Haxhiu

A data warehouse is a collection of several databases whose goal is to help companies and corporations make important decisions about their activities. These decisions are based on analyses of the data within the data warehouse. The data come from what companies and corporations collect on a daily basis from their branches, which may be located in different cities, regions, states and continents. Data entered into data warehouses are historical data, and they represent the part of the data that is important for making decisions. These data undergo a transformation process so that they fit the structure of the objects within the databases in the data warehouse. This is done because the structure of relational databases differs from the structure of the multidimensional databases within the data warehouse. The former are optimized for daily transactions such as entering, changing, deleting and retrieving data through simple queries; the latter are optimized for retrieving data through multidimensional queries, which enable us to extract important information. This information helps in making important decisions by revealing the weak points and the strong points of the company, in order to invest more in the weak points and to strengthen the strong points, increasing the profits of the company. The goal of this paper is to examine data analysis for decision making from a data warehouse by using OLAP (online analytical processing) analysis. For this, we used the Analysis Services of the Microsoft SQL Server 2016 platform. We analyzed the data of an IT Store with branches in different cities in Kosovo and reached conclusions about some sales trends. This paper emphasizes the role of data warehouses in decision making.


Data Mining ◽  
2013 ◽  
pp. 1422-1448
Author(s):  
Fadila Bentayeb ◽  
Nora Maïz ◽  
Hadj Mahboubi ◽  
Cécile Favre ◽  
Sabine Loudcher ◽  
...  

Research in data warehousing and OLAP has produced important technologies for the design, management, and use of Information Systems for decision support. With the development of the Internet, the availability of various types of data has increased. Thus, users require applications to help them obtain knowledge from the Web. One possible solution to facilitate this task is to extract information from the Web, then transform and load it into a Web Warehouse, which provides uniform access methods for automatic processing of the data. In this chapter, we present three innovative lines of research recently introduced to extend the capabilities of decision support systems, namely (1) the use of XML as a logical and physical model for complex data warehouses, (2) associating data mining with OLAP to allow elaborate analysis tasks for complex data and (3) schema evolution in complex data warehouses for personalized analyses. Our contributions cover the main phases of the data warehouse design process: data integration and modeling, and user-driven OLAP analysis.


2011 ◽  
pp. 566-583
Author(s):  
Johann Eder ◽  
Karl Wiggisser

Data Warehouses typically are building blocks of decision support systems in companies and public administration. The data contained in a data warehouse is analyzed by means of OnLine Analytical Processing tools, which provide sophisticated features for aggregating and comparing data. Decision support applications depend on the reliability and accuracy of the contained data. Typically, a data warehouse does not only comprise the current snapshot data but also historical data to enable, for instance, analysis over several years. And, as we live in a changing world, one criterion for the reliability and accuracy of the results of such long period queries is their comparability. Whereas data warehouse systems are well prepared for changes in the transactional data, they are, surprisingly, not able to deal with changes in the master data. Nonetheless, such changes do frequently occur. The crucial point for supporting changes is, first of all, being aware of their existence. Second, once you know that a change took place, it is important to know which change (i.e., knowing about differences between versions and relations between the elements of different versions). For data warehouses this means that changes are identified and represented, validity of data and structures are recorded and this knowledge is used for computing correct results for OLAP queries. This chapter is intended to motivate the need for powerful maintenance mechanisms for data warehouse cubes. It presents some basic terms and definitions for the common understanding and introduces the different aspects of data warehouse maintenance. Furthermore, several approaches addressing the problem are presented and classified by their capabilities.

