BioTracs: A transversal framework for computational workflow standardization and traceability
Background: The need for digital tools for integrative analysis is now significant in most scientific areas. It has led to several community-driven initiatives to standardize the sharing of data and computational workflows. However, there is no open, agnostic framework for modelling and implementing computational workflows, in particular in bioinformatics. It is therefore difficult for data scientists to transparently share and integrate heterogeneous analysis processes coming from different scientific domains, programming languages, projects or teams.

Results: We present BioTracs, a transversal framework for computational workflow standardization and traceability. It is based on the PRISM architecture (Process Resource Interfacing SysteM), an agnostic open architecture that we introduce here to standardize the way processes and resources are modelled and interfaced in computational workflows, in order to ensure traceability and reproducibility and to facilitate sharing. BioTracs is currently implemented in MATLAB and is available under an open-source license on GitHub. Several BioTracs-derived applications are also available online. They were successfully applied to large-scale metabolomics and clinical studies, where they demonstrated flexibility and robustness.

Conclusions: As an implementation of the PRISM architecture, BioTracs paves the way to an open framework in which bioinformaticians can specify and model workflows. The PRISM architecture is designed to provide scalability and transparency from the code level to the project level with less effort. It could also be implemented using open object-oriented languages such as Python, C++ or Java.
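
To make the idea of interfacing processes and resources concrete, the following is a minimal sketch in Python, one of the candidate languages named above. It is not BioTracs code and does not reproduce its API: the class names `Resource`, `Process` and `Workflow`, the fingerprinting scheme and the two example steps are all assumptions introduced for illustration. The sketch only shows how a trace of what ran, with which configuration and on which data, could be recorded as a side effect of running a workflow.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Resource:
    """A piece of data exchanged between processes (hypothetical name)."""
    data: Any

    def fingerprint(self) -> str:
        # Hash a canonical JSON dump so identical data yields an identical ID.
        payload = json.dumps(self.data, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class Process:
    """A named, configured step mapping one Resource to another (hypothetical)."""
    name: str
    func: Callable[[Any, Dict[str, Any]], Any]
    config: Dict[str, Any] = field(default_factory=dict)

    def run(self, resource: Resource, trace: List[Dict[str, Any]]) -> Resource:
        result = Resource(self.func(resource.data, self.config))
        # Record what ran, with which parameters, on which input and output.
        trace.append({
            "process": self.name,
            "config": dict(self.config),
            "input": resource.fingerprint(),
            "output": result.fingerprint(),
        })
        return result


@dataclass
class Workflow:
    """A linear chain of processes; an assumption, real workflows may be graphs."""
    processes: List[Process]
    trace: List[Dict[str, Any]] = field(default_factory=list)

    def run(self, resource: Resource) -> Resource:
        for process in self.processes:
            resource = process.run(resource, self.trace)
        return resource


# Two illustrative steps; neither comes from the BioTracs code base.
def scale(values, config):
    return [v * config["factor"] for v in values]


def threshold(values, config):
    return [v for v in values if v >= config["minimum"]]


workflow = Workflow([
    Process("scale", scale, {"factor": 2}),
    Process("threshold", threshold, {"minimum": 4}),
])
output = workflow.run(Resource([1, 2, 3]))
print(output.data)  # [4, 6]
for step in workflow.trace:
    print(step)
```

In this sketch the output fingerprint of each step matches the input fingerprint of the next, so the trace alone is enough to check that the recorded steps form an unbroken chain.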