Query Performance Optimization in XML Data Warehouses

2009 ◽

pp. 693-700 ◽

Cited By ~ 1

Author(s):

Kamel Aouiche ◽

Jérôme Darmont

Keyword(s):

Response Time ◽

Performance Optimization ◽

Data Access ◽

Original Data ◽

Optimization Techniques ◽

Access Time ◽

Materialized Views ◽

Storage Space ◽

Data Warehouses ◽

Query Response Time

Database management systems (DBMSs) require an administrator whose principal tasks are data management, at both the logical and physical levels, and performance optimization. With the widespread development of databases and data warehouses, minimizing the administration function is crucial. This function includes the selection of suitable physical structures to improve system performance. View materialization and indexing are arguably among the most effective optimization techniques adopted in relational implementations of data warehouses. Materialized views are physical structures that improve data access time by precomputing intermediate results. End-user queries can therefore be processed efficiently using the data stored in views, without accessing the original data. Indexes are also physical structures; they allow direct data access, avoiding sequential scans and thereby reducing query response time. Nevertheless, both solutions require additional storage space and entail maintenance overhead. The issue is then to select an appropriate configuration of materialized views and indexes that minimizes both query response time and maintenance cost, given limited storage space. This problem is NP-hard (Gupta & Mumick, 2005).
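The selection problem described in this abstract can be illustrated with a simple greedy heuristic. The sketch below is not the authors' method: the `Candidate` fields, the candidate names, and the assumption that each structure's benefit is independent of the others are all hypothetical simplifications. Because the problem is NP-hard, a heuristic of this kind may return a suboptimal configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical materialized view or index considered for creation."""
    name: str
    size: float         # storage space required
    benefit: float      # estimated query time saved over the workload
    maintenance: float  # estimated cost of keeping the structure up to date


def greedy_select(candidates, budget):
    """Pick candidates by net benefit per unit of storage until the budget is used.

    Assumes benefits are independent; real views and indexes interact,
    so this is only an illustrative approximation.
    """
    # Discard structures whose maintenance cost outweighs their benefit.
    useful = [c for c in candidates if c.benefit > c.maintenance and c.size > 0]
    # Rank by net benefit per unit of storage, best first.
    ranked = sorted(
        useful,
        key=lambda c: (c.benefit - c.maintenance) / c.size,
        reverse=True,
    )
    chosen = []
    remaining = budget
    for c in ranked:
        if c.size <= remaining:
            chosen.append(c)
            remaining -= c.size
    return chosen


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    workload = [
        Candidate("mv_sales_by_month", size=40, benefit=120, maintenance=20),
        Candidate("idx_customer_id", size=10, benefit=35, maintenance=5),
        Candidate("mv_sales_by_day", size=80, benefit=150, maintenance=60),
        Candidate("idx_rarely_used", size=5, benefit=2, maintenance=4),
    ]
    for c in greedy_select(workload, budget=60):
        print(c.name)
```

With the made-up figures above and a budget of 60, the heuristic keeps the index and the monthly view, skips the daily view because it does not fit, and discards the index whose maintenance cost exceeds its benefit.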

Innovative Approaches for Efficiently Warehousing Complex Data from the Web

Data Mining ◽

10.4018/978-1-4666-2455-9.ch074 ◽

2013 ◽

pp. 1422-1448

Author(s):

Fadila Bentayeb ◽

Nora Maïz ◽

Hadj Mahboubi ◽

Cécile Favre ◽

Sabine Loudcher ◽

...

Keyword(s):

Data Mining ◽

Decision Support ◽

Data Warehouse ◽

Design Management ◽

Complex Data ◽

Data Warehouses ◽

Process Data ◽

Access Methods ◽

Olap Analysis ◽

The Web

Research in data warehousing and OLAP has produced important technologies for the design, management, and use of information systems for decision support. With the development of the Internet, the availability of various types of data has increased. Users therefore require applications that help them obtain knowledge from the Web. One possible solution is to extract information from the Web, then transform and load it into a Web warehouse, which provides uniform access methods for automatic processing of the data. In this chapter, we present three innovative research approaches recently introduced to extend the capabilities of decision support systems, namely (1) the use of XML as a logical and physical model for complex data warehouses, (2) the association of data mining with OLAP to allow elaborate analysis tasks on complex data, and (3) schema evolution in complex data warehouses for personalized analyses. Our contributions cover the main phases of the data warehouse design process: data integration and modeling, and user-driven OLAP analysis.

A Methodology for Building XML Data Warehouses

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch037 ◽

2008 ◽

pp. 530-555

Author(s):

Laura Irina Rusu ◽

J. Wenny Rahayu ◽

David Taniar

Keyword(s):

Data Warehouse ◽

Systematic Approach ◽

Data Cleaning ◽

The Other ◽

Data Warehouses ◽

Xml Data ◽

Xml Database ◽

Xml Documents

Developing a data warehouse for XML documents involves two major processes: creating it, by processing raw XML documents into a specified data warehouse repository; and querying it, by applying techniques to better answer users' queries. This paper focuses on the first part, that is, identifying a systematic approach for building a data warehouse of XML documents, specifically for transferring data from an underlying XML database into a defined XML data warehouse. The proposed methodology for building XML data warehouses covers processes including data cleaning and integration, summarization, creating intermediate XML documents, updating and linking existing documents, and creating fact tables. We also present a case study on how to put this methodology into practice. We use XQuery in all of the above processes.

Data Warehouse Benchmarking with DWEB

Progressive Methods in Data Warehousing and Business Intelligence ◽

10.4018/978-1-60566-232-9.ch015 ◽

2011 ◽

pp. 302-323

Author(s):

Jérôme Darmont

Keyword(s):

Performance Evaluation ◽

Decision Support ◽

Data Warehouse ◽

Database Management ◽

Ad Hoc ◽

Synthetic Data ◽

Transaction Processing ◽

Data Warehouses ◽

On Line ◽

Analytical Processing

Performance evaluation is a key issue for designers and users of Database Management Systems (DBMSs). Performance is generally assessed with software benchmarks that help, for example, to test architectural choices, compare different technologies, or tune a system. In the particular context of data warehousing and On-Line Analytical Processing (OLAP), although the Transaction Processing Performance Council (TPC) aims at issuing standard decision-support benchmarks, few benchmarks actually exist. We present in this chapter the Data Warehouse Engineering Benchmark (DWEB), which allows generating various ad hoc synthetic data warehouses and workloads. DWEB is fully parameterized to fulfill various data warehouse design needs. However, two levels of parameterization keep it relatively easy to tune. We also expand on our previous work on DWEB by presenting its new Extract, Transform, and Load (ETL) feature, as well as its new execution protocol. A Java implementation of DWEB is freely available online and can be interfaced with most existing relational DBMSs. To the best of our knowledge, DWEB is the only easily available, up-to-date benchmark for data warehouses.

Innovative Approaches for Efficiently Warehousing Complex Data from the Web

Business Intelligence Applications and the Web - Advances in Business Information Systems and Analytics ◽

10.4018/978-1-61350-038-5.ch002 ◽

2011 ◽

pp. 26-52 ◽

Cited By ~ 2

Author(s):

Fadila Bentayeb ◽

Nora Maïz ◽

Hadj Mahboubi ◽

Cécile Favre ◽

Sabine Loudcher ◽

...

Keyword(s):

Data Mining ◽

Decision Support ◽

Data Warehouse ◽

Design Management ◽

Complex Data ◽

Data Warehouses ◽

Process Data ◽

Access Methods ◽

Olap Analysis ◽

The Web

Research in data warehousing and OLAP has produced important technologies for the design, management, and use of information systems for decision support. With the development of the Internet, the availability of various types of data has increased. Users therefore require applications that help them obtain knowledge from the Web. One possible solution is to extract information from the Web, then transform and load it into a Web warehouse, which provides uniform access methods for automatic processing of the data. In this chapter, we present three innovative research approaches recently introduced to extend the capabilities of decision support systems, namely (1) the use of XML as a logical and physical model for complex data warehouses, (2) the association of data mining with OLAP to allow elaborate analysis tasks on complex data, and (3) schema evolution in complex data warehouses for personalized analyses. Our contributions cover the main phases of the data warehouse design process: data integration and modeling, and user-driven OLAP analysis.

Materialized View Selection in the Data Warehouse

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.29-32.1133 ◽

2010 ◽

Vol 29-32 ◽

pp. 1133-1138 ◽

Cited By ~ 1

Author(s):

Li Juan Zhou ◽

Hai Jun Geng ◽

Ming Sheng Xu

Keyword(s):

Decision Support ◽

Data Warehouse ◽

Materialized Views ◽

Storage Space ◽

View Selection ◽

Materialized View ◽

Query Response Time ◽

Materialized View Selection ◽

Optimal Efficiency ◽

The Cost

A data warehouse stores materialized views of data from one or more sources in order to answer decision-support or OLAP queries efficiently. Materialized view selection is one of the crucial decisions in designing a data warehouse for optimal efficiency. The goal is to select an appropriate set of views that minimizes the sum of the query response time and the cost of maintaining the selected views, given a limited amount of resources (e.g., materialization time or storage space). In this article, we present an improved PGA algorithm to solve the view selection problem; the experiments show that the proposed algorithm is superior.
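One common way to write down the objective stated in this abstract is as a constrained minimization. The formulation below is an illustrative sketch and is not taken from the paper; every symbol is an assumption: $V$ is the set of candidate views, $M$ the set selected for materialization, $Q$ the query workload with frequencies $f_q$, $C(q, M)$ the cost of answering query $q$ given $M$, $U(v)$ the maintenance cost of view $v$, $S(v)$ its storage size, and $B$ the storage budget.

```latex
\min_{M \subseteq V} \;
  \underbrace{\sum_{q \in Q} f_q \, C(q, M)}_{\text{query response time}}
  \; + \;
  \underbrace{\sum_{v \in M} U(v)}_{\text{maintenance cost}}
\qquad \text{subject to} \qquad
\sum_{v \in M} S(v) \le B
```

Other resource limits mentioned in the abstract, such as materialization time, would appear as additional constraints of the same form.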

Using OCL to Model Constraints in Data Warehouses

Technology Diffusion and Adoption ◽

10.4018/978-1-4666-2791-8.ch014 ◽

2013 ◽

pp. 212-224

Author(s):

François Pinet ◽

Myoung-Ah Kang ◽

Kamal Boulil ◽

Sandro Bimonte ◽

Gil De Sousa ◽

...

Keyword(s):

Decision Support ◽

Decision Support Systems ◽

Data Warehouse ◽

Support Systems ◽

General Information ◽

Data Warehouses ◽

Object Constraint Language ◽

Software Applications ◽

Constraint Language ◽

Transactional Databases

Recent research proposes using Object-Oriented (OO) approaches, such as UML, to model data warehouses. This paper gives an overview of these recent OO techniques, describing the facts and the different analysis dimensions of the data. The authors present a tutorial on the Object Constraint Language (OCL) and show how this language can be used to specify constraints in OO-based models of data warehouses. Previously, OCL had only been applied to describe constraints in software applications and transactional databases. The authors therefore demonstrate how to use OCL to represent the different types of data warehouse constraints. This paper is intended for researchers in the fields of business intelligence and decision support systems who wish to learn about the major possibilities that OCL offers in the context of data warehouses. The authors also provide general information about the possible types of implementation of multi-dimensional models and their constraints.

A Space-Efficient Protocol for Consistency of External View Maintenance on Data Warehouse Systems

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch198 ◽

2008 ◽

pp. 3116-3141

Author(s):

Shi-Ming Huang ◽

David C. Yen ◽

Hsiang-Yuan Hsueh

Keyword(s):

Data Warehouse ◽

Heterogeneous Data ◽

Materialized Views ◽

Data Warehouses ◽

View Maintenance ◽

Synchronization Mechanism ◽

Materialized View ◽

Heterogeneous Data Sources ◽

Data Validity ◽

Data View

The materialized view approach is widely adopted in implementations of data warehouse systems for efficiency purposes. In the construction of a materialized data warehouse system, some managerial problems still exist for most developers and users, particularly in the area of view resource maintenance. Resource redundancy and data inconsistency among materialized views in a data warehouse system are problems that many developers and users struggle with. In this article, a space-efficient protocol for materialized view maintenance with a global data view on data warehouses with embedded proxies is proposed. In the protocol set, multilevel proxy-based protocols with a data compensating mechanism are provided to certify the consistency and uniqueness of materialized data among data resources and materialized views. The authors also provide a set of evaluation experiences and derivations to verify the feasibility of the proposed protocols and mechanisms. With such protocols as proxy services, the performance and space utilization of the materialized view approach will be improved. Furthermore, consistency between materialized data warehouses and heterogeneous data sources can be properly achieved by applying a dynamic compensating and synchronization mechanism. The trade-off between efficiency, storage consumption, and data validity for view maintenance tasks can be properly balanced.

Data Warehouse Maintenance, Evolution and Versioning

Data Warehousing Design and Advanced Engineering Applications ◽

10.4018/978-1-60566-756-0.ch010 ◽

2010 ◽

pp. 171-188 ◽

Cited By ~ 1

Author(s):

Johann Eder ◽

Karl Wiggisser

Keyword(s):

Decision Support ◽

Data Warehouse ◽

Building Blocks ◽

Data Warehouses ◽

Master Data ◽

Analytical Processing ◽

The Common ◽

Instance Analysis ◽

Transactional Data ◽

Common Understanding

Data warehouses are typically building blocks of decision support systems in companies and public administration. The data contained in a data warehouse is analyzed by means of OnLine Analytical Processing tools, which provide sophisticated features for aggregating and comparing data. Decision support applications depend on the reliability and accuracy of the contained data. Typically, a data warehouse comprises not only the current snapshot data but also historical data to enable, for instance, analysis over several years. And, as we live in a changing world, one criterion for the reliability and accuracy of the results of such long-period queries is their comparability. Whereas data warehouse systems are well prepared for changes in the transactional data, they are, surprisingly, not able to deal with changes in the master data. Nonetheless, such changes do frequently occur. The crucial point for supporting changes is, first of all, being aware of their existence. Second, once you know that a change took place, it is important to know which change (i.e., knowing about differences between versions and relations between the elements of different versions). For data warehouses this means that changes are identified and represented, the validity of data and structures is recorded, and this knowledge is used for computing correct results for OLAP queries. This chapter is intended to motivate the need for powerful maintenance mechanisms for data warehouse cubes. It presents some basic terms and definitions for a common understanding and introduces the different aspects of data warehouse maintenance. Furthermore, several approaches addressing the problem are presented and classified by their capabilities.

Harvesting Information from a Library Data Warehouse

Information Technology and Libraries ◽

10.6017/ital.v19i1.10070 ◽

2017 ◽

Vol 19 (1) ◽

pp. 17-28 ◽

Cited By ~ 2

Author(s):

Siew-Phek T. Su ◽

Ashwin Needamangala

Keyword(s):

Decision Support ◽

Knowledge Discovery ◽

Data Warehouse ◽

Data Warehousing ◽

End Users ◽

Cutting Edge ◽

Data Warehouses ◽

Student Data ◽

Integrated Platform ◽

Academic Sector

Data warehousing technology has been defined by John Ladley as "a set of methods, techniques, and tools that are leveraged together and used to produce a vehicle that delivers data to end users on an integrated platform." (1) This concept has been applied increasingly by industries worldwide to develop data warehouses for decision support and knowledge discovery. In the academic sector, several universities have developed data warehouses containing the universities' financial, payroll, personnel, budget, and student data. (2) These data warehouses across all industries and academia have met with varying degrees of success. Data warehousing technology and its related issues have been widely discussed and published. (3) Little has been done, however, on the application of this cutting edge technology in the library environment using library data.
