Query Interaction Based Approach for Horizontal Data Partitioning

2015 ◽  
Vol 11 (2) ◽  
pp. 44-61 ◽  
Author(s):  
Ladjel Bellatreche ◽  
Amira Kerkad

With the explosion of data, many applications are designed around analytical needs, with data warehousing technology at the heart of the construction chain. The data warehouse is usually exploited through complex queries involving selections, joins and aggregations. These queries have the following characteristics: (1) their routine nature, (2) their large number, and (3) the high degree of operation sharing between queries. This interaction has been largely identified in the context of multi-query optimization, where graph data structures were proposed to capture it. During physical design, these structures have also been used to select redundant optimization structures such as materialized views and indexes. Horizontal data partitioning (HDP) is a non-redundant optimization structure that can also be selected in the physical design phase. It is a precondition for designing extremely large databases in several environments: centralized, distributed, parallel and cloud. It aims to reduce the cost of the above operations. In HDP, the space of potential candidates for partitioning grows exponentially with the problem size, making the problem NP-hard. This paper proposes a new approach based on query interactions to select a partitioning schema of a data warehouse in a divide-and-conquer manner, achieving an improved trade-off between the optimization algorithm's speed and the quality of the solution. The effectiveness of the approach is demonstrated through validation using the Star Schema Benchmark (100 GB) on Oracle 11g.
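To make the idea of horizontal partitioning concrete, the following is a minimal sketch, not the approach proposed in the paper. It assumes a toy fact table held in memory; the rows, the attribute names (`region`, `year`, `revenue`) and the helpers `partition` and `revenue_for` are all hypothetical. It shows how splitting rows by attribute values lets a selection query skip irrelevant fragments.

```python
from collections import defaultdict

# Hypothetical toy fact table; real warehouses hold these rows on disk.
ROWS = [
    {"id": 1, "region": "ASIA", "year": 1997, "revenue": 100},
    {"id": 2, "region": "ASIA", "year": 1998, "revenue": 150},
    {"id": 3, "region": "EUROPE", "year": 1997, "revenue": 200},
    {"id": 4, "region": "EUROPE", "year": 1998, "revenue": 50},
    {"id": 5, "region": "AMERICA", "year": 1997, "revenue": 300},
    {"id": 6, "region": "ASIA", "year": 1997, "revenue": 25},
]


def partition(rows, key):
    """Split rows into disjoint fragments, one per distinct key value."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[key(row)].append(row)
    return dict(fragments)


def revenue_for(fragments, region):
    """Sum revenue for one region, scanning only the matching fragments.

    Returns (total_revenue, rows_scanned) so the pruning effect is visible.
    """
    total = 0
    scanned = 0
    for (frag_region, _year), frag_rows in fragments.items():
        if frag_region != region:
            continue  # fragment pruned: none of its rows can match
        scanned += len(frag_rows)
        total += sum(r["revenue"] for r in frag_rows)
    return total, scanned


# Partition on two attributes at once: every (region, year) pair is a fragment.
FRAGMENTS = partition(ROWS, key=lambda r: (r["region"], r["year"]))

if __name__ == "__main__":
    print(len(FRAGMENTS), "fragments")
    print(revenue_for(FRAGMENTS, "ASIA"))
```

In this sketch the number of possible fragments is the product of the number of value groups chosen for each partitioning attribute, which hints at why the candidate space grows so quickly as attributes and sub-domains are added.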

2003 ◽  
Vol 12 (03) ◽  
pp. 325-363 ◽  
Author(s):  
Joseph Fong ◽  
Qing Li ◽  
Shi-Ming Huang

A data warehouse contains vast amounts of data to support the complex queries of various Decision Support Systems (DSSs). It needs to store materialized views of data, which must be available consistently and instantaneously. Using a frame metadata model, this paper presents an architecture for a universal data warehouse supporting different data models. The frame metadata model represents the metadata of a data warehouse, structures an application domain into classes, and integrates the schemas of heterogeneous databases by capturing their semantics. A star schema is derived from user requirements based on the integrated schema and catalogued in the metadata, which stores the schemas of the relational database (RDB) and the object-oriented database (OODB). Data materialization between the RDB and the OODB is achieved by unloading the source database into a sequential file and reloading it into the target database, through which an object-relational view can be defined so that users can obtain the same warehouse view in different data models simultaneously. We describe our procedures for building the relational view of the star schema through multidimensional SQL queries, and the object-oriented view of the data warehouse through Online Analytical Processing (OLAP) method calls, both derived from the integrated schema. To validate our work, an application prototype system based on this approach has been developed in a product sales data warehousing domain.


2010 ◽  
Vol 29-32 ◽  
pp. 1133-1138 ◽  
Author(s):  
Li Juan Zhou ◽  
Hai Jun Geng ◽  
Ming Sheng Xu

A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. Materialized view selection is one of the crucial decisions in designing a data warehouse for optimal efficiency. The goal is to select an appropriate set of views that minimizes the sum of the query response time and the cost of maintaining the selected views, given a limited amount of resources, e.g., materialization time or storage space. In this article, we present an improved PGA algorithm to solve the view selection problem; the experiments show that the proposed algorithm is superior.
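The objective described above can be sketched as follows. This is an illustrative exhaustive search over a tiny instance, not the improved PGA algorithm from the article. The view names, storage sizes, maintenance costs and query costs are made-up values, and the helpers `total_cost` and `best_selection` are hypothetical.

```python
from itertools import combinations

# Hypothetical candidate views: name -> (storage_units, maintenance_cost).
VIEWS = {"v1": (4, 2.0), "v2": (3, 1.0), "v3": (5, 4.0)}

# Hypothetical queries: name -> (cost on base tables, {view: cost using it}).
QUERIES = {
    "q1": (10.0, {"v1": 3.0, "v3": 2.0}),
    "q2": (8.0, {"v2": 2.0}),
    "q3": (6.0, {"v1": 4.0, "v2": 5.0}),
}


def total_cost(selected):
    """Query response time plus maintenance cost for a set of views."""
    query_cost = 0.0
    for base_cost, cost_with_view in QUERIES.values():
        # Each query uses the cheapest option among the selected views.
        options = [base_cost]
        options += [c for v, c in cost_with_view.items() if v in selected]
        query_cost += min(options)
    maintenance = sum(VIEWS[v][1] for v in selected)
    return query_cost + maintenance


def best_selection(storage_limit):
    """Try every subset that fits in storage; return (cost, views)."""
    best = (total_cost(frozenset()), frozenset())
    names = sorted(VIEWS)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if sum(VIEWS[v][0] for v in combo) > storage_limit:
                continue  # violates the storage constraint
            cost = total_cost(frozenset(combo))
            if cost < best[0]:
                best = (cost, frozenset(combo))
    return best


if __name__ == "__main__":
    print(best_selection(storage_limit=8))
```

Exhaustive search examines every subset of candidate views, so it is only workable for a handful of candidates; that limitation is the usual motivation for heuristic or evolutionary search on this problem.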


Author(s):  
Pighin Maurizio ◽  
Ieronutti Lucio

The design and configuration of a data warehouse can be difficult tasks, especially in the case of very large databases and in the presence of redundant information. In particular, the choice of which attributes to treat as dimensions and measures can be nontrivial, and it can heavily influence the effectiveness of the final system. In this article, we propose a methodology targeted at supporting the design and deriving information on the total quality of the final data warehouse. We tested our proposal on three real-world commercial ERP databases.


2009 ◽  
pp. 615-636
Author(s):  
Maurizio Pighin ◽  
Lucio Ieronutti

The design and configuration of a data warehouse can be difficult tasks, especially in the case of very large databases and in the presence of redundant information. In particular, the choice of which attributes to treat as dimensions and measures can be nontrivial, and it can heavily influence the effectiveness of the final system. In this article, we propose a methodology targeted at supporting the design and deriving information on the total quality of the final data warehouse. We tested our proposal on three real-world commercial ERP databases.


2014 ◽  
Vol 926-930 ◽  
pp. 3165-3170
Author(s):  
Jian Bing Xiahou ◽  
Qian Qian Wei ◽  
Xiao Na Deng ◽  
Xiao Wei Liu

Materialized views are an effective method for improving query efficiency in a data warehouse system, and the materialized view selection problem is one of the most important decisions in designing a data warehouse. This paper begins with a brief introduction to materialized views and a study of existing materialized view algorithms. Then, in order to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views under limited storage space, a hybrid algorithm combining the advantages of the ant colony algorithm and the immune genetic algorithm is proposed. In addition, the shortcomings of this algorithm are analyzed and some improvements are proposed, which optimize the efficiency of the algorithm to some extent.


2010 ◽  
pp. 397-417
Author(s):  
Maurizio Pighin ◽  
Lucio Ieronutti

The design and configuration of a data warehouse can be difficult tasks, especially in the case of very large databases and in the presence of redundant information. In particular, the choice of which attributes to treat as dimensions and measures can be nontrivial, and it can heavily influence the effectiveness of the final system. In this article, we propose a methodology targeted at supporting the design and deriving information on the total quality of the final data warehouse. We tested our proposal on three real-world commercial ERP databases.


Author(s):  
Nur Maimun ◽  
Jihan Natassa ◽  
Wen Via Trisna ◽  
Yeye Supriatin

Accuracy in assigning diagnosis codes is an important matter for medical recorders, and data quality is the most important factor in health information management. This study aims to assess coder competency in the accuracy and precision of ICD 10 use at X Hospital in Pekanbaru. The study used a qualitative method with a case study design and five informants. The results show that medical personnel (doctors) have never received training in coding; doctors' handwriting is difficult to read; there are failures in assigning diagnosis or procedure codes; doctors use abbreviations that are not standard; some officers still do not understand the nomenclature or have not mastered anatomy and pathology; and facilities and infrastructure support the accuracy and precision of the existing codes. Coding errors always occur because of human error. The accuracy and precision of coding strongly influence INA CBGs costs. Medical staff and the committee did most of the work in severity level III cases, while the medical record unit had a role in monitoring and evaluating coding implementation. If a medical resume is unclear, the case mix team checks the relevant medical record file to verify that the diagnosis and coding conform. Keywords: coder competency, accuracy and precision of coding, ICD 10


2017 ◽  
pp. 139-145
Author(s):  
R. I. Hamidullin ◽  
L. B. Senkevich

A study is conducted of the quality of cost estimate documentation for construction at all stages of implementing large projects in the oil and gas industry. The main problems that arise in construction organizations are indicated. An analysis is performed of the choice of the most suitable methodology for mathematical modeling of the business process under study, with the aim of improving budget calculations, assessing the quality of estimates, and defining criteria for automating design estimates.


2015 ◽  
Vol 6 (1) ◽  
pp. 50-57
Author(s):  
Rizqa Raaiqa Bintana ◽  
Putri Aisyiyah Rakhma Devi ◽  
Umi Laili Yuhana

The quality of software can be measured by its return on investment. Factors that may affect the return on investment (ROI) are tangible factors (such as cost) and intangible factors (such as the impact of the software on its users or stakeholders). The factors of the software itself are assessed through reviewing, testing, process audits, and software performance. This paper discusses ROI assessment criteria derived from the software and its users. These criteria indicate that the approach may support a rational consideration of all relevant criteria when evaluating software, and the paper shows examples of actual return on investment models. An analysis was conducted of the assessment criteria that affect return on investment when a disproportionate effort on these criteria results in a decreased return on investment for the software. Index Terms - Assessment criteria, Quality assurance, Return on Investment, Software product

