Data Warehouse Performance: Optimization by Double Partitioning of Materialized Views

2021 ◽  
Vol 174 (14) ◽  
pp. 17-19
Author(s):  
Mohamed El Emine Abdel Wedoud ◽  
Mohamed Larbi Benmaati ◽  
Emany Sidi
Author(s):  
Hadj Mahboubi ◽  
Jérôme Darmont

XML data warehouses form an interesting basis for decision-support applications that exploit complex data. However, native-XML database management systems (DBMSs) currently offer limited performance, and it is necessary to research ways to optimize them. In this chapter, the authors present two such techniques. First, they propose an XML join index that is specifically adapted to the multidimensional architecture of XML warehouses. It eliminates join operations while preserving the information contained in the original warehouse. Second, the authors present a strategy for selecting XML materialized views by clustering the query workload. To validate these proposals, the authors measure the response time of a set of decision-support XQueries over an XML data warehouse, with and without their optimization techniques. The authors’ experimental results demonstrate the efficiency of these techniques, even when queries are complex and data are voluminous.


2003 ◽  
Vol 12 (03) ◽  
pp. 325-363 ◽  
Author(s):  
Joseph Fong ◽  
Qing Li ◽  
Shi-Ming Huang

A data warehouse contains a vast amount of data to support the complex queries of various Decision Support Systems (DSSs). It needs to store materialized views of data, which must be available consistently and instantaneously. Using a frame metadata model, this paper presents an architecture for universal data warehousing with different data models. The frame metadata model represents the metadata of a data warehouse, which structures an application domain into classes, and integrates schemas of heterogeneous databases by capturing their semantics. A star schema is derived from user requirements based on the integrated schema, catalogued in the metadata, which stores the schemas of the relational database (RDB) and the object-oriented database (OODB). Data materialization between RDB and OODB is achieved by unloading the source database into a sequential file and reloading it into the target database, through which an object-relational view can be defined so as to allow users to obtain the same warehouse view in different data models simultaneously. We describe our procedures for building the relational view of the star schema by multidimensional SQL query, and the object-oriented view of the data warehouse by Online Analytical Processing (OLAP) through method calls, derived from the integrated schema. To validate our work, an application prototype system has been developed in a product sales data warehousing domain based on this approach.


2010 ◽  
Vol 29-32 ◽  
pp. 1133-1138 ◽  
Author(s):  
Li Juan Zhou ◽  
Hai Jun Geng ◽  
Ming Sheng Xu

A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. Materialized view selection is one of the crucial decisions in designing a data warehouse for optimal efficiency. The goal is to select an appropriate set of views that minimizes the sum of the query response time and the cost of maintaining the selected views, given a limited amount of resources, e.g., materialization time, storage space, etc. In this article, we present an improved PGA algorithm to solve the view selection problem; the experiments show that the proposed algorithm is superior.
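The selection problem described above can be illustrated with a minimal sketch. This is not the improved PGA algorithm from the article; it is a plain greedy heuristic under simplifying assumptions: each candidate view has a fixed, independent query saving and maintenance cost, and storage space is the only limited resource. The `View` fields, the view names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    name: str
    size: int            # storage units needed to materialize the view (assumed)
    query_saving: float  # estimated reduction in total query response time (assumed)
    maintenance: float   # estimated cost of keeping the view up to date (assumed)


def select_views(candidates, space_budget):
    """Greedy selection: rank views by net benefit per storage unit."""
    ranked = sorted(
        candidates,
        key=lambda v: (v.query_saving - v.maintenance) / v.size,
        reverse=True,
    )
    chosen, used = [], 0
    for view in ranked:
        net_benefit = view.query_saving - view.maintenance
        # Keep a view only if it pays for its maintenance and still fits.
        if net_benefit > 0 and used + view.size <= space_budget:
            chosen.append(view)
            used += view.size
    return chosen


# Hypothetical candidate views for a sales warehouse.
candidates = [
    View("sales_by_month", size=40, query_saving=120.0, maintenance=20.0),
    View("sales_by_store", size=60, query_saving=90.0, maintenance=30.0),
    View("sales_by_sku", size=80, query_saving=50.0, maintenance=70.0),
    View("sales_by_region", size=10, query_saving=35.0, maintenance=5.0),
]
selected = select_views(candidates, space_budget=100)
print([v.name for v in selected])
```

In practice the benefit of a view depends on which other views are already materialized, which is why the literature relies on more elaborate search strategies than this independent ranking.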


2008 ◽  
pp. 3116-3141
Author(s):  
Shi-Ming Huang ◽  
David C. Yen ◽  
Hsiang-Yuan Hsueh

The materialized view approach is widely adopted in implementations of data warehouse systems for efficiency purposes. In the construction of a materialized data warehouse system, some managerial problems still exist for most developers and users, particularly in the area of view resource maintenance. Resource redundancy and data inconsistency among materialized views in a data warehouse system are problems that many developers and users struggle with. In this article, a space-efficient protocol for materialized view maintenance with a global data view on data warehouses with embedded proxies is proposed. In the protocol set, multilevel proxy-based protocols with a data compensating mechanism are provided to certify the consistency and uniqueness of materialized data among data resources and materialized views. The authors also provide a set of evaluation experiments and derivations to verify the feasibility of the proposed protocols and mechanisms. With such protocols as proxy services, the performance and space utilization of the materialized view approach will be improved. Furthermore, consistency among materialized data warehouses and heterogeneous data sources can be achieved by applying a dynamic compensating and synchronization mechanism. The trade-off between efficiency, storage consumption, and data validity for view maintenance tasks can be properly balanced.


2008 ◽  
pp. 3085-3115
Author(s):  
Biren Shah ◽  
Karthik Ramachandran ◽  
Vijay Raghavan

Materialized view selection is one of the crucial decisions in designing a data warehouse for optimal efficiency. Static selection of views may materialize certain views that are not beneficial as the data and usage trends change over time. In contrast, dynamic selection of views works better only for queries demanding a high degree of aggregation. These facts point to the need for a technique that combines the improved response time of the static approach and the automated tuning capability of the dynamic approach. In this article, we propose a hybrid approach for the selection of materialized views. The idea is to partition the collection of all views into a static and a dynamic set such that views selected for materialization from the static set are persistent over multiple query (and maintenance) windows, whereas views selected from the dynamic set can be queried and/or replaced on the fly. Highly aggregated views are selected on the fly based on the query access patterns of users, whereas the more detailed static set of views plays a significant role in the efficient maintenance of the dynamic set of views and in answering certain detailed view queries. We prove that our proposed strategy satisfies the monotonicity requirements, which are essential for the greedy heuristic to deliver competitive solutions. Experimental results show that our approach outperforms Dynamat, a well-known dynamic view management system that is known to outperform optimal static view selection.
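The static/dynamic split described above can be sketched as follows. This is an illustrative toy, not the authors' strategy: the number of group-by attributes stands in for the level of detail, the `threshold`, the `DynamicPool` class, and its frequency-based replacement rule are all assumptions made for the example.

```python
from collections import Counter


def partition_views(group_by_counts, threshold):
    """Split views into a detailed static set and an aggregated dynamic set.

    group_by_counts maps a view name to its number of group-by attributes;
    more attributes is taken here to mean a more detailed view (assumption).
    """
    static = {v for v, n in group_by_counts.items() if n >= threshold}
    dynamic = set(group_by_counts) - static
    return static, dynamic


class DynamicPool:
    """Keeps at most `capacity` dynamic views, replaced on the fly by access count."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.hits = Counter()
        self.materialized = set()

    def access(self, view):
        self.hits[view] += 1
        if view in self.materialized:
            return
        if len(self.materialized) >= self.capacity:
            # Candidate for eviction: the least-accessed view (ties broken by name).
            victim = min(self.materialized, key=lambda v: (self.hits[v], v))
            if self.hits[victim] >= self.hits[view]:
                return  # the newcomer is not yet popular enough to replace it
            self.materialized.remove(victim)
        self.materialized.add(view)


# Hypothetical views of a sales cube.
views = {"day_store_product": 3, "month_store": 2, "year": 1, "region": 1, "month": 1}
static_set, dynamic_set = partition_views(views, threshold=2)

pool = DynamicPool(capacity=2)
for queried in ["year", "year", "region", "month", "month", "month"]:
    if queried in dynamic_set:
        pool.access(queried)
print(sorted(static_set), sorted(pool.materialized))
```

The static set stays fixed across query windows, while the pool's contents follow the access pattern; a real system would also weigh view size and maintenance cost when deciding replacements.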


Author(s):  
Ladjel Bellatreche ◽  
Mukesh Mohania

Recently, organizations have increasingly emphasized applications in which current and historical data are analyzed and explored comprehensively, identifying useful trends and creating summaries of the data in order to support high-level decision making. Every organization keeps accumulating data from different functional units, so that the data can be analyzed (after integration) and important decisions can be made from the analytical results. Conceptually, a data warehouse is extremely simple. As popularized by Inmon (1992), it is a “subject-oriented, integrated, time-invariant, non-updatable collection of data used to support management decision-making processes and business intelligence”. A data warehouse is a repository into which all data relevant to the management of an organization are placed and from which the information and knowledge needed to effectively manage the organization emerge. This management can be done using data-mining techniques, comparisons of historical data, and trend analysis. For such analysis, it is vital that (1) data should be accurate, complete, consistent, well defined, and time-stamped for informational purposes; and (2) data should follow business rules and satisfy integrity constraints. Designing a data warehouse is a lengthy, time-consuming, and iterative process. Due to the interactive nature of a data warehouse application, fast query response time is a critical performance goal. Therefore, the physical design of a warehouse receives the lion’s share of the research done in the data warehousing area. Several techniques have been developed to meet the performance requirements of such an application, including materialized views, indexing techniques, partitioning and parallel processing, and so forth. Next, we briefly outline the architecture of a data warehousing system.


2003 ◽  
Author(s):  
Lijuan Zhou ◽  
Chi Liu ◽  
Daxin Liu

2001 ◽  
Vol 10 (03) ◽  
pp. 299-326 ◽  
Author(s):  
DIMITRI THEODORATOS ◽  
MOKRANE BOUZEGHOUB

A Data Warehouse (DW) is a large collection of data integrated from multiple distributed autonomous databases and other information sources. A DW can be seen as a set of materialized views defined over the remote source data. Until now, research on DW design has been restricted to quantitatively selecting view sets for materialization, while quality issues in DW design have been neglected. In this paper we suggest a novel statement of the DW design problem that takes quality factors into account. We design a DW system architecture that supports performance and data consistency quality goals. In this framework we present a high-level approach that makes it possible to check whether a view selection guaranteeing a data completeness quality goal also satisfies a data currency quality goal. This approach is based on an AND/OR dag representation for multiple queries and views. It also allows determining the minimal change propagation frequencies that satisfy the data currency quality goal, along with the optimal query evaluation and change propagation plans. Our results can be directly used for a quality-driven design of a DW.

