Optimizing security and cost of workflow execution using task annotation and genetic-based algorithm

Workflows have been adopted in several scientific fields as a tool for specifying and executing scientific experiments. In addition to automating the execution of experiments, workflow systems often record provenance information, which includes, among other things, the data records used and generated by the workflow as a whole and by each of its component modules. It is widely recognized that provenance information is useful for interpreting, verifying, and reusing workflow results, which justifies sharing and publishing it among scientists. However, workflow executions in some branches of science manipulate sensitive datasets that contain information about individuals. To address this issue, we investigate in this article the problem of anonymizing workflow provenance. We consider a popular class of workflows in which each module invocation uses and generates collections of data records rather than a single data record. The solution we propose offers confidentiality guarantees without compromising lineage information, which provides transparency about the relationships between the data records used and generated by the workflow modules. We provide algorithms that show how the provenance of a single module and of an entire workflow can be anonymized, and we present the results of the experiments we conducted to evaluate them.

Retry Scopes to Enable Robust Workflow Execution in Pervasive Environments

Service-Oriented Computing. ICSOC/ServiceWave 2009 Workshops - Lecture Notes in Computer Science

DOI: 10.1007/978-3-642-16132-2_34; 2010; pp. 358-369; Cited By ~ 1

Author(s): Hanna Eberle, Oliver Kopp, Tobias Unger, Frank Leymann

Keyword(s): Workflow Execution, Pervasive Environments

Replication Heuristics for Efficient Workflow Execution on Grids

On the Move to Meaningful Internet Systems 2007: OTM 2007 Workshops - Lecture Notes in Computer Science

DOI: 10.1007/978-3-540-76888-3_16; 2007; pp. 31-32

Author(s): J. L. Vázquez-Poletti, E. Huedo, R. S. Montero, I. M. Llorente

Keyword(s): Workflow Execution

A Simulation Study of Job Workflow Execution Models over the Grid

Grid and Cooperative Computing - Lecture Notes in Computer Science

DOI: 10.1007/978-3-540-24680-0_148; 2004; pp. 935-943; Cited By ~ 2

Author(s): Yuhong Feng, Wentong Cai, Jiannong Cao

Keyword(s): Simulation Study, Workflow Execution, Execution Models

Dynamic Compatibility Matching of Services for Distributed Workflow Execution

Parallel Processing and Applied Mathematics - Lecture Notes in Computer Science

DOI: 10.1007/978-3-642-31500-8_16; 2012; pp. 151-160

Author(s): Paweł Czarnul, Michał Wójcik

Keyword(s): Distributed Workflow, Workflow Execution

Event-Driven Scientific Workflow Execution

Business Process Management Workshops - Lecture Notes in Business Information Processing

DOI: 10.1007/978-3-642-36285-9_42; 2013; pp. 390-401; Cited By ~ 1

Author(s): Zhili Zhao, Adrian Paschke

Keyword(s): Scientific Workflow, Workflow Execution, Event Driven

Data Management in Scientific Workflows

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Data Intensive Distributed Computing

DOI: 10.4018/978-1-61520-971-2.ch008; 2012; pp. 177-187

Author(s): Ewa Deelman, Ann Chervenak

Keyword(s): Data Management, Gravitational Wave, Large Scale, Real Life, Scientific Workflows, Workflow Systems, Automated Generation, Workflow Execution, Data Products, Workflow Planning

Scientific applications in astronomy, earthquake science, gravitational-wave physics, and other fields have embraced workflow technologies to do large-scale science. Workflows enable researchers to collaboratively design, manage, and obtain results that involve hundreds of thousands of steps, access terabytes of data, and generate similar amounts of intermediate and final data products. Although workflow systems facilitate the automated generation of data products, many issues remain to be addressed. These issues take different forms at different stages of the workflow lifecycle. This chapter describes the workflow lifecycle as consisting of a workflow generation phase, where the analysis is defined; a workflow planning phase, where the resources needed for execution are selected; a workflow execution phase, where the actual computations take place; and a phase in which results, metadata, and provenance are stored. The authors discuss the issues related to data management at each step of the workflow lifecycle. They describe challenge problems and illustrate them in the context of real-life applications. They discuss the challenges, possible solutions, and open issues faced when mapping and executing large-scale workflows on current cyberinfrastructure. They particularly emphasize the issues related to the management of data throughout the workflow lifecycle.
