Optimizing security and cost of workflow execution using task annotation and genetic-based algorithm

Computing ◽  
2021 ◽  
Author(s):  
Henrique Y. Shishido ◽  
Júlio C. Estrella ◽  
Claudio F. M. Toledo ◽  
Stephan Reiff-Marganiec
Keyword(s):  
2011 ◽  
Vol 55 (13) ◽  
pp. 2954-2975 ◽  
Author(s):  
Juliano Araujo Wickboldt ◽  
Luís Armando Bianchin ◽  
Roben Castagna Lunardi ◽  
Lisandro Zambenedetti Granville ◽  
Luciano Paschoal Gaspary ◽  
...  

Author(s):  
Augusto Morales ◽  
Jose-Ignacio Castillo-Velazquez ◽  
Borja Bordel ◽  
Miguel Hernandez

2022 ◽  
Vol 14 (1) ◽  
pp. 1-27
Author(s):  
Khalid Belhajjame

Workflows have been adopted in several scientific fields as a tool for the specification and execution of scientific experiments. In addition to automating the execution of experiments, workflow systems often include capabilities to record provenance information, which contains, among other things, the data records used and generated both by the workflow as a whole and by its component modules. It is widely recognized that provenance information can be useful for the interpretation, verification, and re-use of workflow results, justifying its sharing and publication among scientists. However, workflow execution in some branches of science can manipulate sensitive datasets that contain information about individuals. To address this issue, we investigate in this article the problem of anonymizing the provenance of workflows. In doing so, we consider a popular class of workflows in which component modules use and generate collections of data records as a result of their invocation, as opposed to a single data record. The solution we propose offers guarantees of confidentiality without compromising lineage information, which provides transparency as to the relationships between the data records used and generated by the workflow modules. We provide algorithmic solutions that show how the provenance of a single module and of an entire workflow can be anonymized, and we present the results of experiments that we conducted for their evaluation.


Author(s):  
Ewa Deelman ◽  
Ann Chervenak

Scientific applications such as those in astronomy, earthquake science, gravitational-wave physics, and others have embraced workflow technologies to conduct large-scale science. Workflows enable researchers to collaboratively design, manage, and obtain results that involve hundreds of thousands of steps, access terabytes of data, and generate similar amounts of intermediate and final data products. Although workflow systems are able to facilitate the automated generation of data products, many issues remain to be addressed. These issues exist in different forms throughout the workflow lifecycle. This chapter describes a workflow lifecycle as consisting of a workflow generation phase, where the analysis is defined; a workflow planning phase, where the resources needed for execution are selected; a workflow execution phase, where the actual computations take place; and a phase in which results, metadata, and provenance are stored. The authors discuss the issues related to data management at each step of the workflow lifecycle. They describe challenge problems and illustrate them in the context of real-life applications. They discuss the challenges, possible solutions, and open issues faced when mapping and executing large-scale workflows on current cyberinfrastructure. They particularly emphasize the issues related to the management of data throughout the workflow lifecycle.

