Design considerations for workflow management systems use in production genomics research and the clinic

2021, Vol 11 (1)
Author(s): Azza E. Ahmed, Joshua M. Allen, Tajesvi Bhat, Prakruthi Burra, Christina E. Fliege, ...

Abstract: The changing landscape of genomics research and clinical practice has created a need for computational pipelines capable of efficiently orchestrating complex analysis stages while handling large volumes of data across heterogeneous computational environments. Workflow Management Systems (WfMSs) are the software components employed to fill this gap. This work provides an approach and systematic evaluation of key features of popular bioinformatics WfMSs in use today: Nextflow, CWL, and WDL and some of their executors, along with Swift/T, a workflow manager commonly used in high-scale physics applications. We employed two use cases: a variant-calling genomic pipeline and a scalability-testing framework, both of which were run locally, on an HPC cluster, and in the cloud. This allowed for evaluation of those four WfMSs in terms of language expressiveness, modularity, scalability, robustness, reproducibility, interoperability, and ease of development, along with adoption and usage in research labs and healthcare settings. This article tries to answer the question, "which WfMS should be chosen for a given bioinformatics application regardless of analysis type?" The choice of a given WfMS is a function of both its intrinsic language and engine features. Within bioinformatics, where analysts are a mix of dry and wet lab scientists, the choice is also governed by collaborations and adoption within large consortia and technical support provided by the WfMS team/community. As the community and its needs continue to evolve along with computational infrastructure, WfMSs will also evolve, especially those with permissive licenses that allow commercial use. In much the same way as the dataflow paradigm and containerization are now well understood to be very useful in bioinformatics applications, we will continue to see innovations of tools and utilities for other purposes, like big data technologies, interoperability, and provenance.

2021
Author(s): Azza E Ahmed, Joshua Allen, Tajesvi Bhat, Prakruthi Burra, Christina E Fliege, ...

Background: The changing landscape of genomics research and clinical practice has created a need for computational pipelines capable of efficiently orchestrating complex analysis stages while handling large volumes of data across heterogeneous computational environments. Workflow Management Systems (WfMSs) are the software components employed to fill this gap. Results: This work provides an approach and systematic evaluation of key features of popular bioinformatics WfMSs in use today: Nextflow, CWL, and WDL and some of their executors, along with Swift/T, a workflow manager commonly used in high-scale physics applications. We employed two use cases: a variant-calling genomic pipeline and a scalability-testing framework, both of which were run locally, on an HPC cluster, and in the cloud. This allowed for evaluation of those four WfMSs in terms of language expressiveness, modularity, scalability, robustness, reproducibility, interoperability, and ease of development, along with adoption and usage in research labs and healthcare settings. This article tries to answer the question, "which WfMS should be chosen for a given bioinformatics application regardless of analysis type?" Conclusions: The choice of a given WfMS is a function of both its intrinsic language and engine features. Within bioinformatics, where analysts are a mix of dry and wet lab scientists, the choice is also governed by collaborations and adoption within large consortia and technical support provided by the WfMS team/community. As the community and its needs continue to evolve along with computational infrastructure, WfMSs will also evolve, especially those with permissive licenses that allow commercial use. In much the same way as the dataflow paradigm and containerization are now well understood to be very useful in bioinformatics applications, we will continue to see innovations of tools and utilities for other purposes, like big data technologies, interoperability, and provenance.


2012, Vol 2 (4), pp. 20-34
Author(s): Bob Chermin, Ingmar Frey, Hajo Reijers, Harm Smeets

Even though workflow management systems are currently not being applied on a wide scale in healthcare settings, their benefits with respect to operational efficiency and reducing patient risk seem enticing. The authors show how an approach that is rooted in simulation can be useful to predict the benefits of using a workflow management system. The approach is discussed and its application is demonstrated in the setting of the pre-operative process as executed at the Bronovo hospital. The approach is considered useful for other healthcare organizations in search of a better foundation for the application of workflow technology.


2019, Vol 16 (3)
Author(s): Jens Allmer

Abstract: Big data and complex analysis workflows (pipelines) are common issues in data-driven sciences such as bioinformatics. A large number of computational tools are available for data analysis. Additionally, many workflow management systems have been developed to piece such tools together into data analysis pipelines. For example, more than 50 computational tools for read mapping are available, representing a large amount of duplicated effort. Furthermore, it is unclear whether these tools are correct, and only a few have a user base large enough to have encountered and reported most of the potential problems. Bringing together many largely untested tools in a computational pipeline must lead to unpredictable results. Yet, this is the current state. While data analysis is presently performed on personal computers, workstations, and clusters, the future will see development and analysis shift to the cloud. None of the workflow management systems is ready for this transition. This presents the opportunity to build a new system, which will overcome current duplications of effort, introduce proper testing, allow for development and analysis in public and private clouds, and include reporting features leading to interactive documents.


2013, pp. 1155-1169
Author(s): Bob Chermin, Ingmar Frey, Hajo A. Reijers, Harm Smeets

Even though workflow management systems are currently not being applied on a wide scale in healthcare settings, their benefits with respect to operational efficiency and reducing patient risk seem enticing. The authors show how an approach that is rooted in simulation can be useful to predict the benefits of using a workflow management system. The approach is discussed and its application is demonstrated in the setting of the pre-operative process as executed at the Bronovo hospital. The approach is considered useful for other healthcare organizations in search of a better foundation for the application of workflow technology.


Author(s): Tobias Käfer, Benjamin Jochum, Nico Aßfalg, Leonard Nürnberg

Abstract: For Read-Write Linked Data, an environment of reasoning and RESTful interaction, we investigate the use of the Guard-Stage-Milestone approach for specifying and executing user agents. We present an ontology to specify user agents. Moreover, we give operational semantics to the ontology in a rule language that allows for executing user agents on Read-Write Linked Data. We evaluate our approach formally and with regard to performance. Our work shows that, despite the different assumptions of this environment compared to the traditional environment of workflow management systems, the Guard-Stage-Milestone approach can be transferred and successfully applied to the web of Read-Write Linked Data.


1998
Author(s): Thomas Wendler, Kirsten Meetz, Joachim Schmidt

2014, Vol 36, pp. 352-362
Author(s): Sonja Holl, Olav Zimmermann, Magnus Palmblad, Yassene Mohammed, Martin Hofmann-Apitius
