ETL Workflow: Latest Research Papers

ETL Logs Under a Pattern-Oriented Approach

International Journal of Data Warehousing and Mining ◽

10.4018/ijdwm.2021100102 ◽

2021 ◽

Vol 17 (4) ◽

pp. 29-47

Author(s):

Bruno Oliveira ◽

Óscar Oliveira ◽

Orlando Belo

Keyword(s):

Process Development ◽

Evolutionary Process ◽

Development Teams ◽

Etl Workflow ◽

Oriented Approach

Considering extract-transform-load (ETL) as a complex and evolutionary process, development teams must conscientiously and rigorously create logging strategies to extract the most value from the information that can be gathered from the events that occur throughout the ETL workflow. Efficient logging strategies must be structured so that metrics, logs, and alerts can, beyond their troubleshooting capabilities, provide insights about the system. This paper presents a configurable and flexible ETL component for creating logging mechanisms in ETL workflows. A pattern-oriented approach is followed as a way to abstract ETL activities and enable their mapping to physical primitives that can be interpreted by commercial ETL tools.
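To make the idea of structured, metric-bearing ETL logs concrete, here is a minimal Python sketch. It is not the component described in the paper: the decorator `logged_step`, the logger name `"etl"`, the record fields, and the example step `drop_null_ids` are all hypothetical names chosen for illustration. It only shows one way a step could emit a machine-readable record that serves both troubleshooting and monitoring.

```python
import json
import logging
import time
from functools import wraps

# Hypothetical logger name; any logging configuration can be attached to it.
logger = logging.getLogger("etl")


def logged_step(name):
    """Hypothetical decorator: emit one structured (JSON) log record per step run."""
    def decorator(func):
        @wraps(func)
        def wrapper(rows):
            start = time.perf_counter()
            try:
                result = func(rows)
            except Exception as exc:
                # Failure record: useful for alerts and troubleshooting.
                logger.error(json.dumps(
                    {"step": name, "status": "failed", "error": str(exc)}
                ))
                raise
            # Success record: row counts and duration double as metrics.
            logger.info(json.dumps({
                "step": name,
                "status": "ok",
                "rows_in": len(rows),
                "rows_out": len(result),
                "seconds": round(time.perf_counter() - start, 6),
            }))
            return result
        return wrapper
    return decorator


@logged_step("drop_null_ids")
def drop_null_ids(rows):
    """Example transformation: discard rows whose 'id' is missing or None."""
    return [r for r in rows if r.get("id") is not None]
```

Because each record is a JSON object, the same log stream can be filtered for failures or aggregated into per-step row counts and durations without parsing free text.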

Data Migration using ETL Workflow

2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS) ◽

10.1109/icaccs51430.2021.9441840 ◽

2021 ◽

Author(s):

N Saranya ◽

R Brindha ◽

N Aishwariya ◽

R Kokila ◽

P Matheswaran ◽

...

Keyword(s):

Data Migration ◽

Etl Workflow

Parallelizing user-defined functions in the ETL workflow using orchestration style sheets

International Journal of Applied Mathematics and Computer Science ◽

10.2478/amcs-2019-0005 ◽

2019 ◽

Vol 29 (1) ◽

pp. 69-79 ◽

Cited By ~ 1

Author(s):

Syed Muhammad Fawad Ali ◽

Johannes Mey ◽

Maik Thiele

Keyword(s):

Parallel Computing ◽

Design Pattern ◽

Large Scale ◽

Distributed Environment ◽

Data Intensive ◽

Etl Workflow ◽

Distributed And Parallel Computing ◽

Taking Care

Today’s ETL tools provide capabilities to develop custom code as user-defined functions (UDFs) to extend the expressiveness of the standard ETL operators. However, while this allows us to easily add new functionality, it also comes with the risk that the custom code is not written to be optimized, e.g., by parallelism, and therefore performs poorly in data-intensive ETL workflows. In this paper we present a novel framework that allows the ETL developer to choose a design pattern in order to write parallelizable code, and that generates a configuration for the UDFs to be executed in a distributed environment. This enables ETL developers with minimal expertise in distributed and parallel computing to develop UDFs without having to handle parallelization configurations and complexities. We perform experiments on large-scale datasets based on TPC-DS and BigBench. The results show that our approach significantly reduces the effort of ETL developers and at the same time generates efficient parallel configurations to support complex and data-intensive ETL tasks.
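The general idea of running a row-wise UDF in parallel can be illustrated with a small partition, apply, merge sketch in Python. This is not the framework or the orchestration style sheets from the paper: the helpers `partition` and `parallel_map_udf` and the example UDF `normalize_name` are hypothetical, and a local thread pool stands in for a distributed environment purely to keep the example self-contained. For CPU-bound UDFs a process pool or a cluster engine would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows into at most n_parts contiguous chunks, preserving order."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parallel_map_udf(udf, rows, n_parts=4):
    """Apply a row-wise UDF to each partition concurrently, then merge."""
    parts = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        # Executor.map yields results in the order of the input partitions,
        # so the merged output keeps the original row order.
        mapped = pool.map(lambda part: [udf(r) for r in part], parts)
        return [row for part in mapped for row in part]


def normalize_name(row):
    """Example UDF (hypothetical): trim and title-case the 'name' field."""
    return {**row, "name": row["name"].strip().title()}
```

The UDF itself contains no parallelism logic. The partitioning and scheduling live entirely in the wrapper, which is the separation of concerns the abstract describes at a much larger scale.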

Towards a Cost Model to Optimize User-Defined Functions in an ETL Workflow Based on User-Defined Performance Metrics

Advances in Databases and Information Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-030-28730-6_27 ◽

2019 ◽

pp. 441-456

Author(s):

Syed Muhammad Fawad Ali ◽

Robert Wrembel

Keyword(s):

Performance Metrics ◽

Cost Model ◽

Etl Workflow

ETL workflow reparation by means of case-based reasoning

Information Systems Frontiers ◽

10.1007/s10796-016-9732-0 ◽

2017 ◽

Vol 20 (1) ◽

pp. 21-43 ◽

Cited By ~ 8

Author(s):

Artur Wojciechowski

Keyword(s):

Case Based Reasoning ◽

Etl Workflow ◽

Case Based

ETL Workflow Generation for Offloading Dormant Data from the Data Warehouse to Hadoop

Issues In Information Systems ◽

10.48009/1_iis_2015_91-101 ◽

2015 ◽

Keyword(s):

Data Warehouse ◽

Etl Workflow

Revised Framework for ETL Workflow Management for Efficient Business Decision-Making

International Journal of Computer Theory and Engineering ◽

10.7763/ijcte.2013.v5.734 ◽

2013 ◽

pp. 484-487

Author(s):

Saifur Rehman Malik ◽

Azra Shamim ◽

Zanib Bibi ◽

Sajid Ullah Khan ◽

Shabir Ahmad Gorsi

Keyword(s):

Decision Making ◽

Workflow Management ◽

Business Decision ◽

Etl Workflow

A schema aware ETL workflow generator

Information Systems Frontiers ◽

10.1007/s10796-012-9352-2 ◽

2012 ◽

Vol 16 (3) ◽

pp. 453-471 ◽

Cited By ~ 2

Author(s):

Naiqiao Du ◽

Xiaojun Ye ◽

Jianmin Wang

Keyword(s):

Etl Workflow

An MAS-based and fault-tolerant distributed ETL workflow engine

Proceedings of the 2012 IEEE 16th International Conference on Computer Supported Cooperative Work in Design (CSCWD) ◽

10.1109/cscwd.2012.6221797 ◽

2012 ◽

Cited By ~ 2

Author(s):

Jinluan Huang ◽

Chaozhen Guo

Keyword(s):

Fault Tolerant ◽

Workflow Engine ◽

Etl Workflow

Data Warehouse Refreshment

Data Warehouses and OLAP ◽

10.4018/987-1-59904-364-7.ch005 ◽

2011 ◽

pp. 111-135 ◽

Cited By ~ 6

Author(s):

Alkis Simitsis ◽

Panos Vassiliadis ◽

Spiros Skiadopoulos ◽

Timos Sellis

Keyword(s):

Data Warehouse ◽

Advantages And Disadvantages ◽

Early Stages ◽

To Come ◽

Back Stage ◽

Etl Workflow

In the early stages of a data warehouse project, the designers/administrators have to decide on the design and deployment of the back-stage architecture. The possible options are (a) the use of a commercial ETL tool, or (b) the development of an in-house ETL prototype. Both options have advantages and disadvantages. However, in both cases the design and modeling of the ETL workflows have the same characteristics. The scope of this chapter is to indicate the main challenges, issues, and problems concerning the construction of ETL workflows, in order to help the designers/administrators decide which solution better suits their data warehouse project and to help them construct an efficient, robust, and evolvable ETL workflow that implements the refreshment of their warehouse.
