Towards Comparative Analysis of Resumption Techniques in ETL

2021 ◽  
Vol 3 (2) ◽  
pp. 82
Author(s):  
Mohammed Muddasir ◽  
Raghuveer K ◽  
Dayanand R

Data warehouses are loaded with data from sources such as operational databases. Failure of the loading process, or of any constituent process such as extraction or transformation, is expensive because it leaves data unavailable for analysis. With the advent of e-commerce and many real-time applications, analysis of data in real time has become the norm, so any misses while data is being loaded into the data warehouse need to be handled in an efficient and optimized way. Techniques for handling the failure of the population process are as important as the loading process itself. Alternative arrangements need to be made in case of failure so that the data warehouse is still populated on time. This paper explores the various ways in which a failed process of populating the data warehouse can be resumed. Various resumption techniques are compared, and a novel block-based technique is proposed to improve one of the existing resumption techniques.
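To make the block-based idea concrete, here is a minimal sketch (not the paper's actual implementation; all names are hypothetical): source rows are loaded in fixed-size blocks, the index of each committed block is checkpointed, and a restarted load skips blocks that were already committed before the failure.

```python
# Hypothetical sketch of block-based resumption for a failed warehouse load.

def load_in_blocks(rows, block_size, sink, checkpoint):
    """Load `rows` into `sink` in fixed-size blocks, skipping blocks whose
    index is already recorded in `checkpoint` (a set of completed blocks)."""
    blocks = [rows[i:i + block_size] for i in range(0, len(rows), block_size)]
    for idx, block in enumerate(blocks):
        if idx in checkpoint:      # block was committed before the failure
            continue
        sink.extend(block)         # "commit" the block to the warehouse
        checkpoint.add(idx)        # record progress for a later resumption

rows = list(range(10))
sink, ckpt = [], set()

# First run fails partway: only the first 6 source rows were seen.
load_in_blocks(rows[:6], 3, sink, ckpt)

# Resumed run re-reads the full source but skips the completed blocks,
# so no row is loaded twice.
load_in_blocks(rows, 3, sink, ckpt)
```

The checkpoint set stands in for durable state (e.g. a control table in the warehouse); the point is that resumption cost is proportional to the unloaded blocks, not the whole source.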

Author(s):  
Kheri Arionadi Shobirin ◽  
Adi Panca Saputra Iskandar ◽  
Ida Bagus Alit Swamardika

A data warehouse is a central repository of integrated data from one or more disparate sources, bringing operational data from On-Line Transaction Processing (OLTP) systems into use for decision-making strategy and business intelligence via On-Line Analytical Processing (OLAP) techniques. Data warehouses support OLAP applications by storing and maintaining data in multidimensional format. Multidimensional data models, an integral part of OLAP, are designed to solve complex query analysis in real time.
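A minimal sketch of the multidimensional idea (illustrative only; the dimensions and figures are invented): facts are keyed by tuples of dimension values, and a roll-up aggregates the measure over every dimension not kept.

```python
from collections import defaultdict

# Tiny multidimensional model: each fact is keyed by a
# (year, region, product) tuple; the value is the measure.
facts = {
    ("2021", "EU", "laptop"): 120,
    ("2021", "EU", "phone"):   80,
    ("2021", "US", "laptop"): 200,
}

def rollup(facts, keep):
    """Aggregate the measure over all dimensions whose index is
    not listed in `keep` (indices into the dimension tuple)."""
    out = defaultdict(int)
    for dims, measure in facts.items():
        out[tuple(dims[i] for i in keep)] += measure
    return dict(out)

# Total sales per region (dimension index 1):
by_region = rollup(facts, keep=[1])
```

Real OLAP engines precompute and index such aggregates; the sketch only shows the shape of the operation.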


2016 ◽  
Vol 6 (6) ◽  
pp. 1241-1244 ◽  
Author(s):  
M. Faridi Masouleh ◽  
M. A. Afshar Kazemi ◽  
M. Alborzi ◽  
A. Toloie Eshlaghy

Extraction, Transformation and Loading (ETL) is one of the notable subjects in the optimization, management, improvement and acceleration of processes and operations in databases and data warehouses. The creation of ETL processes is potentially one of the greatest tasks in building a data warehouse, and so their production is a time-consuming and complicated procedure. Without optimization of these processes, the implementation of data warehouse projects is costly, complicated and time-consuming. The present paper combined parallelization methods with shared cache memory in distributed systems based on a data warehouse. According to the conducted assessment, the proposed method exhibited a 7.1% speed improvement over the Kettle optimization instrument and 7.9% over the Talend instrument in terms of ETL process execution time. Therefore, parallelization can notably improve the ETL process. It ultimately allows the management and integration of big data to be implemented in a simple way and with acceptable speed.
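The combination of parallel workers with a shared cache can be sketched as follows (a simplified, hypothetical illustration, not the paper's distributed implementation): several transform workers run concurrently, and an expensive dimension-key lookup is memoized in one cache shared by all of them.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Shared lookup cache for all transform workers, guarded by a lock.
cache, lock = {}, Lock()

def lookup_surrogate_key(natural_key):
    """Stand-in for an expensive dimension lookup, memoized so each
    distinct key is resolved only once across all parallel workers."""
    with lock:
        if natural_key not in cache:
            cache[natural_key] = len(cache) + 1  # placeholder for a DB round-trip
        return cache[natural_key]

def transform(row):
    # ETL transform step: replace the natural key with its surrogate key.
    return (lookup_surrogate_key(row[0]), row[1])

rows = [("cust-a", 10), ("cust-b", 20), ("cust-a", 30)]
with ThreadPoolExecutor(max_workers=4) as pool:
    loaded = list(pool.map(transform, rows))   # parallel transform phase
```

The second occurrence of `"cust-a"` hits the cache instead of repeating the lookup; in a distributed setting the same role is played by a shared in-memory cache rather than a thread-local dictionary.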


2008 ◽  
pp. 2749-2761
Author(s):  
Hugh J. Watson ◽  
Barbara H. Wixom ◽  
Dale L. Goodhue

Data warehouses are helping resolve a major problem that has plagued decision support applications over the years: a lack of good data. Top management at 3M realized that the company had to move from being product-centric to being customer savvy. In response, 3M built a terabyte data warehouse (global enterprise data warehouse) that provides thousands of 3M employees with real-time access to accurate, global, detailed information. The data warehouse underlies new Web-based customer services that are dynamically generated based on warehouse information. Useful lessons were learned at 3M during its years of developing the data warehouse.


2013 ◽  
Vol 9 (2) ◽  
pp. 21-38 ◽  
Author(s):  
Florian Waas ◽  
Robert Wrembel ◽  
Tobias Freudenreich ◽  
Maik Thiele ◽  
Christian Koncilia ◽  
...  

In a typical BI infrastructure, data extracted from operational data sources is transformed, cleansed, and loaded into a data warehouse by a periodic ETL process, typically executed on a nightly basis, i.e., a full day's worth of data is processed and loaded during off-hours. However, it is desirable to have fresher data for business insights in near real time. To this end, the authors propose to leverage a data warehouse's capability to directly import raw, unprocessed records and defer the transformation and data cleansing until needed by pending reports. At that time, the database's own processing mechanisms can be deployed to process the data on demand. Event-processing capabilities are seamlessly woven into the proposed architecture. Besides outlining an overall architecture, the authors also developed a roadmap for implementing a complete prototype using conventional database technology in the form of hierarchical materialized views.
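The deferred-transformation idea can be sketched in a few lines (a toy illustration under invented names, not the authors' materialized-view prototype): raw records are imported immediately with no processing, and cleansing runs lazily the first time a report asks for the data, with the cleansed result cached until new raw data invalidates it.

```python
# Sketch of lazy ETL: import raw records now, cleanse on first report access.

raw_table = []       # landing area: unprocessed records, loaded in near real time
clean_cache = None   # stand-in for a materialized view, rebuilt lazily

def ingest(record):
    """Near-real-time import: no transformation at load time."""
    global clean_cache
    raw_table.append(record)
    clean_cache = None   # invalidate the cached view; rebuild on next read

def report_view():
    """Cleansing is deferred until a pending report needs the data."""
    global clean_cache
    if clean_cache is None:
        # On-demand transform: trim whitespace, drop empties, normalize case.
        clean_cache = [r.strip().lower() for r in raw_table if r.strip()]
    return clean_cache

ingest("  Alice ")
ingest("")
ingest("BOB")
result = report_view()   # the transformation happens here, on demand
```

In the paper's setting, the cached list corresponds to a hierarchy of materialized views inside the database, and the invalidation is handled by the database's own view-maintenance machinery.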


2012 ◽  
Vol 11 (9) ◽  
pp. 1310-1315
Author(s):  
Xuedong Du ◽  
Guilin Li ◽  
Jiangtao Ji ◽  
Xiaomei Tan

Author(s):  
Munesh Chandra Trivedi ◽  
Virendra Kumar Yadav ◽  
Avadhesh Kumar Gupta

A data warehouse generally contains both types of data, i.e. historical and current data from various data sources. A data warehouse, in the world of computing, can be defined as a system created for the analysis and reporting of both of these types of data. These analysis reports are then used by an organization to make decisions that help its growth. Construction of a data warehouse appears simple: collection of data from data sources into one place (after extraction, transformation and loading). But construction involves several issues such as inconsistent data, logic conflicts, user acceptance, cost, quality, security, stakeholders' contradictions, REST alignment, etc. These issues need to be overcome, otherwise they will lead to unfortunate consequences affecting the organization's growth. The proposed model tries to solve issues such as REST alignment and stakeholders' contradictions by involving experts from various domains (technical, analytical, decision makers, management representatives, etc.) during the initialization phase, to better understand the requirements, and by mapping these requirements to data sources during the design phase of the data warehouse.


2011 ◽  
pp. 202-216 ◽  
Author(s):  
Hugh J. Watson ◽  
Barbara H. Wixom ◽  
Dale L. Goodhue

Data warehouses are helping resolve a major problem that has plagued decision support applications over the years: a lack of good data. Top management at 3M realized that the company had to move from being product-centric to being customer savvy. In response, 3M built a terabyte data warehouse (global enterprise data warehouse) that provides thousands of 3M employees with real-time access to accurate, global, detailed information. The data warehouse underlies new Web-based customer services that are dynamically generated based on warehouse information. Useful lessons were learned at 3M during its years of developing the data warehouse.

