Distributed in-memory data management for workflow executions

2021 ◽  
Vol 7 ◽  
pp. e527
Author(s):  
Renan Souza ◽  
Vitor Silva ◽  
Alexandre A. B. Lima ◽  
Daniel de Oliveira ◽  
Patrick Valduriez ◽  
...  

Complex scientific experiments from various domains are typically modeled as workflows and executed on large-scale machines using a Parallel Workflow Management System (WMS). Since such executions usually last for hours or days, some WMSs provide user steering support, i.e., they allow users to run data analyses and, depending on the results, adapt the workflows at runtime. A challenge in the parallel execution control design is to manage workflow data for efficient executions while enabling user steering support. Data access for high scalability is typically transaction-oriented, whereas data access for analysis is online analytical-oriented; managing such hybrid workloads makes the challenge even harder. In this work, we present SchalaDB, an architecture with a set of design principles and techniques based on distributed in-memory data management for efficient workflow execution control and user steering. We propose a distributed data design for scalable workflow task scheduling and high availability, driven by a parallel and distributed in-memory DBMS. To evaluate our proposal, we develop d-Chiron, a WMS designed according to SchalaDB’s principles. We carry out an extensive experimental evaluation on an HPC cluster with up to 960 computing cores. Among other analyses, we show that even when running data analyses for user steering, SchalaDB’s overhead is negligible for workloads composed of hundreds of concurrent tasks on shared data. Our results encourage workflow engine developers to follow a parallel and distributed data-oriented approach not only for scheduling and monitoring but also for user steering.
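To make the hybrid OLTP/OLAP workload described above concrete, here is a minimal sketch in Python that uses an in-memory SQLite database as a stand-in for the parallel, distributed in-memory DBMS; the task table, column names, and queries are assumptions for illustration and do not reflect d-Chiron's actual schema or API.

```python
import sqlite3

# Hypothetical stand-in for the distributed in-memory DBMS: a single
# in-memory SQLite database holding the workflow task table.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE task (
    id INTEGER PRIMARY KEY,
    state TEXT,          -- READY, RUNNING, or DONE
    duration_s REAL      -- filled in when the task completes
)""")
db.executemany("INSERT INTO task (id, state) VALUES (?, 'READY')",
               [(i,) for i in range(100)])

def claim_task(conn):
    """Transactional (OLTP-style) access: a worker atomically claims one READY task."""
    with conn:  # implicit BEGIN ... COMMIT
        row = conn.execute(
            "SELECT id FROM task WHERE state = 'READY' LIMIT 1").fetchone()
        if row is None:
            return None
        conn.execute("UPDATE task SET state = 'RUNNING' WHERE id = ?", (row[0],))
        return row[0]

def steering_query(conn):
    """Analytical (OLAP-style) access: a user monitors progress at runtime."""
    return conn.execute(
        "SELECT state, COUNT(*), AVG(duration_s) FROM task GROUP BY state").fetchall()

tid = claim_task(db)
db.execute("UPDATE task SET state = 'DONE', duration_s = 1.7 WHERE id = ?", (tid,))
print(steering_query(db))  # e.g. [('DONE', 1, 1.7), ('READY', 99, None)]
```

In such a design, task claiming stays short and transactional while steering queries aggregate over the same shared task data, which is why a single DBMS must cope with both workload types.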

Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system

Author(s):
Alexandra Carpen-Amarie ◽  
Alexandru Costan ◽  
Jing Cai ◽  
Gabriel Antoniu ◽  
Luc Bougé

Introspection is the prerequisite of autonomic behavior, the first step towards performance improvement and resource usage optimization for large-scale distributed systems. In grid environments, the task of observing the application behavior is assigned to monitoring systems. However, most of them are designed to provide general resource information and do not consider specific information for higher-level services. More precisely, in the context of data-intensive applications, a specific introspection layer is required to collect data about the usage of storage resources, data access patterns, etc. This paper discusses the requirements for an introspection layer in a data management system for large-scale distributed infrastructures. We focus on the case of BlobSeer, a large-scale distributed system for storing massive data. The paper explains why and how to enhance BlobSeer with introspective capabilities and proposes a three-layered architecture relying on the MonALISA monitoring framework. We illustrate the autonomic behavior of BlobSeer with a self-configuration component aiming to provide storage elasticity by dynamically scaling the number of data providers. Then we propose a preliminary approach for enabling self-protection for the BlobSeer system, through a malicious client detection component. The introspective architecture has been evaluated on the Grid'5000 testbed, with experiments that prove the feasibility of generating relevant information related to the state and behavior of the system.
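As a rough sketch of the self-configuration idea, the following loop scales the number of data providers up or down based on monitored storage utilization; the thresholds, function name, and inputs are hypothetical and do not correspond to BlobSeer's or MonALISA's real interfaces.

```python
# Hypothetical self-configuration loop for storage elasticity.
# The monitoring input and the scaling decision are placeholders, not
# BlobSeer's or MonALISA's actual APIs.

SCALE_UP_THRESHOLD = 0.80    # grow the pool when average utilization exceeds 80%
SCALE_DOWN_THRESHOLD = 0.30  # shrink it when utilization drops below 30%
MIN_PROVIDERS = 2

def reconfigure(providers, utilization_by_provider):
    """Decide how many data providers the system should run next."""
    avg = sum(utilization_by_provider.values()) / len(utilization_by_provider)
    if avg > SCALE_UP_THRESHOLD:
        return providers + 1            # add a provider: storage pressure is high
    if avg < SCALE_DOWN_THRESHOLD and providers > MIN_PROVIDERS:
        return providers - 1            # release a provider: capacity is wasted
    return providers                    # keep the current configuration

# Example: three providers, two of them nearly full -> the loop adds one.
print(reconfigure(3, {"p1": 0.95, "p2": 0.90, "p3": 0.60}))  # 4
```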


Author(s):  
Ewa Deelman ◽  
Ann Chervenak

Scientific applications such as those in astronomy, earthquake science, gravitational-wave physics, and others have embraced workflow technologies to do large-scale science. Workflows enable researchers to collaboratively design, manage, and obtain results that involve hundreds of thousands of steps, access terabytes of data, and generate similar amounts of intermediate and final data products. Although workflow systems are able to facilitate the automated generation of data products, many issues still remain to be addressed. These issues exist in different forms in the workflow lifecycle. This chapter describes a workflow lifecycle as consisting of a workflow generation phase where the analysis is defined, a workflow planning phase where resources needed for execution are selected, a workflow execution phase where the actual computations take place, and a result, metadata, and provenance storing phase. The authors discuss the issues related to data management at each step of the workflow lifecycle. They describe challenge problems and illustrate them in the context of real-life applications. They discuss the challenges, possible solutions, and open issues faced when mapping and executing large-scale workflows on current cyberinfrastructure. They particularly emphasize the issues related to the management of data throughout the workflow lifecycle.
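The lifecycle the authors describe can be pictured as an ordered pipeline of phases; the sketch below is purely illustrative, with placeholder function bodies rather than any real workflow system.

```python
# Illustrative only: the chapter's workflow lifecycle expressed as an ordered
# sequence of phases. The function bodies are placeholders, not a real WMS.

def generate(spec):            # workflow generation: define the analysis
    return {"tasks": spec["steps"]}

def plan(workflow):            # workflow planning: select execution resources
    return [{"task": t, "site": "cluster-A"} for t in workflow["tasks"]]

def execute(planned):          # workflow execution: run the computations
    return [{"task": p["task"], "output": p["task"] + ".dat"} for p in planned]

def store(results):            # result, metadata, and provenance storage
    return {r["task"]: r["output"] for r in results}

LIFECYCLE = (generate, plan, execute, store)

state = {"steps": ["align", "stack", "analyze"]}
for phase in LIFECYCLE:
    state = phase(state)
print(state)  # {'align': 'align.dat', 'stack': 'stack.dat', 'analyze': 'analyze.dat'}
```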


2003 ◽  
Vol 12 (04) ◽  
pp. 411-440 ◽  
Author(s):  
Roberto Silveira Silva Filho ◽  
Jacques Wainer ◽  
Edmundo R. M. Madeira

Standard workflow management systems are usually designed as client-server systems. The central server is responsible for coordinating the workflow execution and, in some cases, may also manage the activities database. This centralized control architecture may represent a single point of failure, which compromises the availability of the system. We propose a fully distributed and configurable architecture for workflow management systems. It is based on the idea that the activities of a case (an instance of the process) migrate from host to host, executing the workflow tasks while following a process plan. This core architecture is extended with additional distributed components so that other requirements for workflow management systems, besides scalability, are also addressed. The components of the architecture were tested in different distributed and centralized configurations. The ability to configure the location of components and the use of dynamic task allocation proved effective for implementing load balancing policies.
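A minimal sketch of the case-migration idea follows: each case carries its own process plan and "migrates" to a new host for every activity. The host names and round-robin allocation policy are invented for illustration and are not the paper's implementation.

```python
from itertools import cycle

# Hypothetical illustration of a case migrating between hosts while following
# its process plan; hosts and the round-robin policy are placeholders.

HOSTS = cycle(["host-a", "host-b", "host-c"])   # simple load-spreading policy

class Case:
    """One instance of a business process, carrying its own plan and history."""
    def __init__(self, case_id, plan):
        self.case_id = case_id
        self.plan = list(plan)     # ordered workflow activities still to run
        self.trace = []            # (activity, host) pairs already executed

    def run(self):
        while self.plan:
            activity = self.plan.pop(0)
            host = next(HOSTS)                 # "migrate" to the next host
            self.trace.append((activity, host))
        return self.trace

print(Case("order-42", ["receive", "check-credit", "ship", "invoice"]).run())
# [('receive', 'host-a'), ('check-credit', 'host-b'),
#  ('ship', 'host-c'), ('invoice', 'host-a')]
```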


Sensors ◽  
2018 ◽  
Vol 18 (8) ◽  
pp. 2611 ◽  
Author(s):  
Theofanis Raptis ◽  
Andrea Passarella ◽  
Marco Conti

Maintaining critical data access latency requirements is an important challenge of Industry 4.0. Traditional, centralized industrial networks, which transfer the data to a central network controller prior to delivery, might be incapable of meeting such strict requirements. In this paper, we exploit distributed data management to overcome this issue. Given a set of data, the set of consumer nodes, and the maximum access latency that consumers can tolerate, we consider a method for identifying and selecting a limited set of proxies in the network where data needed by the consumer nodes can be cached. The method aims to balance two requirements: keeping data access latency within the given constraints while selecting only a small number of proxies. We implement the method and evaluate its performance using a network of WSN430 IEEE 802.15.4-enabled open nodes. Additionally, we validate a simulation model and use it for performance evaluation at larger scales and on more general topologies. We demonstrate that the proposed method (i) guarantees average access latency below the given threshold and (ii) outperforms traditional centralized and even distributed approaches.
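As an illustration of the proxy-selection problem (not the authors' method), the greedy sketch below picks proxies until every consumer can reach a cached copy within the latency bound; the latency matrix, node names, and threshold are invented.

```python
# Greedy illustration of proxy selection under a latency bound. This is an
# invented example, not the selection method evaluated in the paper.

LATENCY_MS = {                       # hypothetical consumer -> candidate latencies
    "c1": {"n1": 3, "n2": 9, "n3": 15},
    "c2": {"n1": 12, "n2": 4, "n3": 6},
    "c3": {"n1": 14, "n2": 11, "n3": 5},
}
MAX_LATENCY_MS = 8                   # the access-latency requirement

def select_proxies(latency, bound):
    """Pick few proxies so every consumer reaches one within the bound."""
    uncovered = set(latency)
    candidates = {n for row in latency.values() for n in row}
    proxies = []
    while uncovered:
        # choose the candidate covering the most still-uncovered consumers
        best = max(candidates,
                   key=lambda n: sum(latency[c][n] <= bound for c in uncovered))
        covered = {c for c in uncovered if latency[c][best] <= bound}
        if not covered:
            raise ValueError("latency bound cannot be met for remaining consumers")
        proxies.append(best)
        uncovered -= covered
    return proxies

print(select_proxies(LATENCY_MS, MAX_LATENCY_MS))  # ['n3', 'n1']
```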

