BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

2011 ◽  
Vol 61 (1) ◽  
pp. 170-173 ◽  
Author(s):  
Daniel L. Ayres ◽  
Aaron Darling ◽  
Derrick J. Zwickl ◽  
Peter Beerli ◽  
Mark T. Holder ◽  
...  
2018 ◽  
Author(s):  
John M Macdonald ◽  
Christopher M Lalansingh ◽  
Christopher I Cooper ◽  
Anqi Yang ◽  
Felix Lam ◽  
...  

Abstract

Background
Most biocomputing pipelines are run on clusters of computers. Each type of cluster has its own API (application programming interface). That API defines how a program that is to run on the cluster must request the submission, content and monitoring of jobs to be run on the cluster. Sometimes, it is desirable to run the same pipeline on different types of cluster. This can happen in situations including when:
- different labs are collaborating, but they do not use the same type of cluster
- a pipeline is released to other labs as open source or commercial software
- a lab has access to multiple types of cluster, and wants to choose between them for scaling, cost or other purposes
- a lab is migrating their infrastructure from one cluster type to another
- during testing or travelling, it is often desired to run on a single computer

However, since each type of cluster has its own API, code that runs jobs on one type of cluster needs to be re-written if that application is to run on a different type of cluster. To resolve this problem, we created a software module to generalize the submission of pipelines across computing environments, including local compute, clouds and clusters.

Results
HPCI (High Performance Computing Interface) is a Perl module that provides the interface to a standardized generic cluster. When the HPCI module is used, it accepts a parameter to specify the cluster type. The HPCI module uses this to load a driver, HPCD::<cluster>, which translates the abstract HPCI interface to the specific software interface. Simply by changing the cluster parameter, the same pipeline can be run on a different type of cluster with no other changes.

Conclusion
The HPCI module assists in writing Perl programs that can be run in different lab environments, with different site configuration requirements and different types of hardware clusters. Rather than having to re-write portions of the program, it is only necessary to change a configuration file. Using HPCI, an application can manage collections of jobs to be run, specify ordering dependencies, detect success or failure of jobs run, and allow automatic retry of failed jobs (allowing for the possibility of a changed configuration, such as when the original attempt specified an inadequate memory allotment).
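The driver-loading design the abstract describes (one cluster-type parameter selecting a backend, with the pipeline code unchanged) can be sketched as follows. This is an illustrative Python sketch of the pattern only, not HPCI's actual Perl API; all class and function names here are hypothetical.

```python
# Hypothetical sketch of the HPCI-style dispatch pattern: the pipeline
# talks to one generic interface, and a single parameter picks the
# cluster-specific driver (analogous to loading HPCD::<cluster> by name).

class LocalDriver:
    """Stand-in driver that would run jobs on the local machine."""
    def submit(self, command):
        return f"local: would run '{command}'"

class SGEDriver:
    """Stand-in driver that would submit jobs to a Sun Grid Engine queue."""
    def submit(self, command):
        return f"qsub: would submit '{command}'"

# Registry mapping the cluster-type parameter to a driver class.
DRIVERS = {"local": LocalDriver, "SGE": SGEDriver}

def run_pipeline(cluster, commands):
    driver = DRIVERS[cluster]()          # choose the backend once
    return [driver.submit(c) for c in commands]

# The same pipeline runs against either backend by changing one argument.
print(run_pipeline("local", ["align reads"]))
print(run_pipeline("SGE", ["align reads"]))
```

The design choice illustrated is that only the registry lookup depends on the cluster type; everything above it stays identical, which is what lets a configuration-file change replace code changes.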


2020 ◽  
Vol 36 (8) ◽  
pp. 2592-2594 ◽  
Author(s):  
Deren A R Eaton ◽  
Isaac Overcast

Abstract

Summary
ipyrad is a free and open source tool for assembling and analyzing restriction site-associated DNA sequence datasets using de novo and/or reference-based approaches. It is designed to be massively scalable to hundreds of taxa and thousands of samples, and can be efficiently parallelized on high performance computing clusters. It is available both as a command line interface and as a Python package with an application programming interface, the latter of which can be used interactively to write complex, reproducible scripts and implement a suite of downstream analysis tools.

Availability and implementation
ipyrad is a free and open source program written in Python. Source code is available from the GitHub repository (https://github.com/dereneaton/ipyrad/), and Linux and MacOS installs are distributed through the conda package manager. Complete documentation, including numerous tutorials, and Jupyter notebooks demonstrating example assemblies and applications of downstream analysis tools are available online: https://ipyrad.readthedocs.io/.


Author(s):  
Sergio Iserte ◽  
Héctor Martínez ◽  
Sergio Barrachina ◽  
Maribel Castillo ◽  
Rafael Mayo ◽  
...  

Several studies have proved the benefits of job malleability, that is, the capacity of an application to adapt its parallelism to a dynamically changing number of allocated processors. The most remarkable advantages of executing malleable jobs as part of a high performance computer workload are the increase in throughput and the more efficient utilization of the underlying resources. Malleability has mostly been applied to iterative applications in which all the processes execute the same operations over different sets of data with a balanced per-process load. Unfortunately, not all scientific applications adhere to this process-level malleable job structure: some are noniterative, and others present an irregular per-process load distribution. Unlike many other reconfiguration tools, the Dynamic Management of Resources Application Programming Interface (DMR API) provides the flexibility needed to make these out-of-target applications malleable. In this article, we study the particular case of using the DMR API to generate a malleable version of HPG aligner, a distributed-memory noniterative genomic sequencer featuring an irregular communication pattern among processes. Through this first conversion of an out-of-target application to a malleable job, we both illustrate how the DMR API may be used to convert this type of application into a malleable one and test the benefits of this conversion in production clusters. Our experimental results reveal an important reduction in the completion time of malleable HPG aligner jobs compared to the original HPG aligner version. Furthermore, malleable HPG aligner workloads achieve a greater throughput than their fixed counterparts.
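The core idea of malleability, re-partitioning remaining work whenever the allocated processor count changes rather than keeping a fixed process layout, can be shown with a minimal toy sketch. This is not the DMR API (which targets MPI applications); all names below are hypothetical, and the "workers" are simulated.

```python
# Toy illustration of job malleability: a workload re-partitions its
# remaining tasks each "phase", where every phase may run with a
# different number of allocated processors (expand/shrink events).

def partition(tasks, nprocs):
    """Split tasks as evenly as possible across nprocs workers."""
    chunks = [[] for _ in range(nprocs)]
    for i, t in enumerate(tasks):
        chunks[i % nprocs].append(t)
    return chunks

def malleable_run(tasks, allocations):
    """Process tasks in phases; `allocations` lists the processor count
    granted in each phase, mimicking dynamic reconfiguration."""
    done, remaining = [], list(tasks)
    for nprocs in allocations:
        if not remaining:
            break
        chunks = partition(remaining, nprocs)
        # each simulated worker completes its first task this phase
        finished = [c[0] for c in chunks if c]
        done.extend(finished)
        remaining = [t for t in remaining if t not in finished]
    return done, remaining

# Ten tasks, with the allocation growing from 2 to 4 processors mid-run.
done, remaining = malleable_run(range(10), [2, 4, 4])
```

A fixed-layout job would have kept using 2 processors after the first phase; the malleable version absorbs the extra processors and finishes the remaining tasks in fewer phases, which is the throughput effect the abstract reports.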

