workflow execution
Recently Published Documents


TOTAL DOCUMENTS

192
(FIVE YEARS 30)

H-INDEX

15
(FIVE YEARS 1)

2022 ◽  
Vol 14 (1) ◽  
pp. 1-27
Author(s):  
Khalid Belhajjame

Workflows have been adopted in several scientific fields as a tool for the specification and execution of scientific experiments. In addition to automating the execution of experiments, workflow systems often include capabilities to record provenance information, which contains, among other things, data records used and generated by the workflow as a whole but also by its component modules. It is widely recognized that provenance information can be useful for the interpretation, verification, and re-use of workflow results, justifying its sharing and publication among scientists. However, workflow execution in some branches of science can manipulate sensitive datasets that contain information about individuals. To address this problem, we investigate, in this article, the problem of anonymizing the provenance of workflows. In doing so, we consider a popular class of workflows in which component modules use and generate collections of data records as a result of their invocation, as opposed to a single data record. The solution we propose offers guarantees of confidentiality without compromising lineage information, which provides transparency as to the relationships between the data records used and generated by the workflow modules. We provide algorithmic solutions that show how the provenance of a single module and an entire workflow can be anonymized and present the results of experiments that we conducted for their evaluation.


Author(s):  
Alberto Scionti ◽  
Paolo Viviani ◽  
Giacomo Vitali ◽  
Chiara Vercellino ◽  
Olivier Terzo ◽  
...  

Author(s):  
Martin Golasowski ◽  
Jan Martinovič ◽  
Jan Křenek ◽  
Kateřina Slaninová ◽  
Marc Levrier ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Monsef Boughrous ◽  
Hanan El Bakkali

Workflow management systems are very important for any organization to manage and model complex business processes. However, significant work is needed to keep a workflow resilient and secure. Therefore, organizations apply a strict security policy and enforce access control constraints. As a result, the number of available and authorized users for the workflow execution decreases drastically. Thus, in many cases, such a situation leads to a workflow deadlock situation, where there no available authorized user-task assignments for critical tasks to accomplish the workflow execution. In the literature, this problem has gained interest of security researchers in the recent years, and is known as the workflow satisfiability problem (WSP). In this paper, we propose a new approach to bypass the WSP and to ensure workflow resiliency and security. For this purpose, we define workflow criticality, which can be used as a metric during run-time to prevent WSP. We believe that the workflow criticality value will help workflow managers to make decisions and start a mitigation solution in case of a critical workflow. Moreover, we propose a delegation process algorithm (DP) as a mitigation solution that uses workflow instance criticality, delegation, and priority concepts to find authorized and suitable users to perform the critical task with low-security risks.


2021 ◽  
Author(s):  
Peter van Heusden ◽  
Ziphozahe Mashologu ◽  
Thoba Lose ◽  
Robin Warren ◽  
Alan Christoffels

Whole Genome Sequencing (WGS) is a powerful method for detecting drug resistance, genetic diversity and transmission dynamics of Mycobacterium tuberculosis. Implementation of WGS in public health microbiology laboratories is impeded by a lack of user-friendly, automated and semi-automated pipelines. We present the COMBAT-TB workbench, a modular, easy to install application that provides a web based environment for Mycobacterium tuberculosis bioinformatics. The COMBAT-TB Workbench is built using two main software components: the IRIDA Platform for its web-based user interface and data management capabilities and the Galaxy bioinformatics workflow platform for workflow execution. These components are combined into a single easy to install application using Docker container technology. We implemented two workflows, for M. tuberculosis sample analysis and phylogeny, in Galaxy. Building our workflows involved updating some Galaxy tools (Trimmomatic, snippy and snp-sites) and writing new Galaxy tools (snp-dists, TB-Profiler, tb_variant_filter and TB Variant Report). The irida-wf-ga2xml tool was updated to be able to work with recent versions of Galaxy and was further developed into IRIDA plugins for both workflows. In the case of the M. tuberculosis sample analysis an interface was added to update the metadata stored for each sequence sample with results gleaned from the Galaxy workflow output. Data can be loaded into the COMBAT-TB Workbench via the web interface or via the command line IRIDA uploader tool. The COMBAT-TB Workbench application deploys IRIDA, the COMBAT-TB IRIDA plugins, the MariaDB database and Galaxy using Docker containers (https://github.com/COMBAT-TB/irida-galaxy-deploy).


2021 ◽  
Vol 7 ◽  
pp. e527
Author(s):  
Renan Souza ◽  
Vitor Silva ◽  
Alexandre A. B. Lima ◽  
Daniel de Oliveira ◽  
Patrick Valduriez ◽  
...  

Complex scientific experiments from various domains are typically modeled as workflows and executed on large-scale machines using a Parallel Workflow Management System (WMS). Since such executions usually last for hours or days, some WMSs provide user steering support, i.e., they allow users to run data analyses and, depending on the results, adapt the workflows at runtime. A challenge in the parallel execution control design is to manage workflow data for efficient executions while enabling user steering support. Data access for high scalability is typically transaction-oriented, while for data analysis, it is online analytical-oriented so that managing such hybrid workloads makes the challenge even harder. In this work, we present SchalaDB, an architecture with a set of design principles and techniques based on distributed in-memory data management for efficient workflow execution control and user steering. We propose a distributed data design for scalable workflow task scheduling and high availability driven by a parallel and distributed in-memory DBMS. To evaluate our proposal, we develop d-Chiron, a WMS designed according to SchalaDB’s principles. We carry out an extensive experimental evaluation on an HPC cluster with up to 960 computing cores. Among other analyses, we show that even when running data analyses for user steering, SchalaDB’s overhead is negligible for workloads composed of hundreds of concurrent tasks on shared data. Our results encourage workflow engine developers to follow a parallel and distributed data-oriented approach not only for scheduling and monitoring but also for user steering.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Donglai Fu ◽  
Yanhua Liu

Fairness plays a vital role in crowd computing by attracting its workers. The power of crowd computing stems from a large number of workers potentially available to provide high quality of service and reduce costs. An important challenge in the crowdsourcing market today is the task allocation of crowdsourcing workflows. Requester-centric task allocation algorithms aim to maximize the completion quality of the entire workflow and minimize its total cost, which are discriminatory for workers. The crowdsourcing workflow needs to balance two objectives, namely, fairness and cost. In this study, we propose an alternative greedy approach with four heuristic strategies to address such an issue. In particular, the proposed approach aims to monitor the current status of workflow execution and use heuristic strategies to adjust the parameters of task allocation. We design a two-phase allocation model to accurately match the tasks with workers. The F-Aware allocates each task to the worker that maximizes the fairness and minimizes the cost. We conduct extensive experiments to quantitatively evaluate the proposed algorithms in terms of running time, fairness, and cost by using a customer objective function on the WorkflowSim, a well-known cloud simulation tool. Experimental results based on real-world workflows show that the F-Aware, which is 1% better than the best competitor algorithm, outperforms other optimal solutions in finding the tradeoff between fairness and cost.


Author(s):  
Michel Krämer ◽  
Hendrik M. Würz ◽  
Christian Altenhofen

AbstractWe present an algorithm and a software architecture for a cloud-based system that executes cyclic scientific workflows whose structure may change during run time. Existing approaches either rely on workflow definitions based on directed acyclic graphs (DAGs) or require workarounds to implement cyclic structures. In contrast, our system supports cycles natively, avoids workarounds, and as such reduces the complexity of workflow modelling and maintenance. Our algorithm traverses workflow graphs and transforms them iteratively into linear sequences of executable actions. We call these sequences process chains. Our software architecture distributes the process chains to multiple compute nodes in the cloud and oversees their execution. We evaluate our approach by applying it to two practical use cases from the domains of astronomy and engineering. We also compare it with two existing workflow management systems. The evaluation demonstrates that our algorithm is able to execute dynamically changing workflows with cycles and that design and maintenance of complex workflows is easier than with existing solutions. It also shows that our software architecture can run process chains on multiple compute nodes in parallel to significantly speed up the workflow execution. An implementation of our algorithm and the software architecture is available with the Steep Workflow Management System that we released under an open-source license. The resources for the first practical use case are also available as open source for reproduction.


Computing ◽  
2021 ◽  
Author(s):  
Henrique Y. Shishido ◽  
Júlio C. Estrella ◽  
Claudio F. M. Toledo ◽  
Stephan Reiff-Marganiec
Keyword(s):  

Sign in / Sign up

Export Citation Format

Share Document