Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks

AbstractEach year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.

Download Full-text

Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks

10.1101/2021.04.04.438423 ◽

2021 ◽

Cited By ~ 1

Author(s):

Julia Koehler Leman ◽

Sergey Lyskov ◽

Steven Lewis ◽

Jared Adolf-Bryfogle ◽

Rebecca F. Alford ◽

...

Keyword(s):

Scientific Community ◽

High Performance ◽

Scientific Software ◽

Modeling Tools ◽

Design Concepts ◽

Software Applications ◽

Engineering Practices ◽

Macromolecular Modeling ◽

High Performance Computing Cluster ◽

Reproducible Manner

AbstractEach year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.

Download Full-text

Managing Scientific Software Complexity with Bocca and CCA

Scientific Programming ◽

10.1155/2008/417946 ◽

2008 ◽

Vol 16 (4) ◽

pp. 315-327 ◽

Cited By ~ 2

Author(s):

Benjamin A. Allan ◽

Boyana Norris ◽

Wael R. Elwasif ◽

Robert C. Armstrong

Keyword(s):

High Performance ◽

Web Applications ◽

Scientific Software ◽

Common Component ◽

Software Complexity ◽

Common Component Architecture ◽

Engineering Practices ◽

Short Time ◽

Application Developers ◽

Glue Code

In high-performance scientific software development, the emphasis is often on short time to first solution. Even when the development of new components mostly reuses existing components or libraries and only small amounts of new code must be created, dealing with the component glue code and software build processes to obtain complete applications is still tedious and error-prone. Component-based software meant to reduce complexity at the application level increases complexity to the extent that the user must learn and remember the interfaces and conventions of the component model itself. To address these needs, we introduce Bocca, the first tool to enable application developers to perform rapid component prototyping while maintaining robust software-engineering practices suitable to HPC environments. Bocca provides project management and a comprehensive build environment for creating and managing applications composed of Common Component Architecture components. Of critical importance for high-performance computing (HPC) applications, Bocca is designed to operate in a language-agnostic way, simultaneously handling components written in any of the languages commonly used in scientific applications: C, C++, Fortran, Python and Java. Bocca automates the tasks related to the component glue code, freeing the user to focus on the scientific aspects of the application. Bocca embraces the philosophy pioneered by Ruby on Rails for web applications: start with something that works, and evolve it to the user's purpose.

Download Full-text

Rapidly Reconfigurable High Performance Computing Cluster

10.21236/ada438586 ◽

2005 ◽

Author(s):

Mark A. Richards ◽

Daniel P. Campbell

Keyword(s):

High Performance Computing ◽

High Performance ◽

High Performance Computing Cluster ◽

Performance Computing ◽

Computing Cluster

Download Full-text

Spatial and temporal tools for building a human cell atlas

Molecular Biology of the Cell ◽

10.1091/mbc.e18-10-0667 ◽

2019 ◽

Vol 30 (19) ◽

pp. 2435-2438 ◽

Cited By ~ 2

Author(s):

Jonah Cool ◽

Richard S. Conroy ◽

Sean E. Hanlon ◽

Shannon K. Hughes ◽

Ananda L. Roy

Keyword(s):

Scientific Community ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Experimental Conditions ◽

Modeling Tools ◽

Holistic View ◽

Multiple Timescales ◽

Single Cell Sequencing ◽

Coordination And Collaboration

Improvements in the sensitivity, content, and throughput of microscopy, in the depth and throughput of single-cell sequencing approaches, and in computational and modeling tools for data integration have created a portfolio of methods for building spatiotemporal cell atlases. Challenges in this fast-moving field include optimizing experimental conditions to allow a holistic view of tissues, extending molecular analysis across multiple timescales, and developing new tools for 1) managing large data sets, 2) extracting patterns and correlation from these data, and 3) integrating and visualizing data and derived results in an informative way. The utility of these tools and atlases for the broader scientific community will be accelerated through a commitment to findable, accessible, interoperable, and reusable data and tool sharing principles that can be facilitated through coordination and collaboration between programs working in this space.

Download Full-text

A Review on the Development of Rotman Lens Antenna

Chinese Journal of Engineering ◽

10.1155/2014/385385 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 9

Author(s):

Shruti Vashist ◽

M. K. Soni ◽

P. K. Singhal

Keyword(s):

High Performance ◽

Beam Steering ◽

Phase Error ◽

Antenna System ◽

Surveillance Systems ◽

Electronic Warfare ◽

Design Concepts ◽

Low Profile ◽

True Time ◽

The One

Rotman lenses are the beguiling devices used by the beamforming networks (BFNs). These lenses are generally used in the radar surveillance systems to see targets in multiple directions due to its multibeam capability without physically moving the antenna system. Now a days these lenses are being integrated into many radars and electronic warfare systems around the world. The antenna should be capable of producing multiple beams which can be steered without changing the orientation of the antenna. Microwave lenses are the one who support low-phase error, wideband, and wide-angle scanning. They are the true time delay (TTD) devices producing frequency independent beam steering. The emerging printed lenses in recent years have facilitated the advancement of designing high performance but low-profile, light-weight, and small-size and networks (BFNs). This paper will review and analyze various design concepts used over the years to improve the scanning capability of the lens developed by various researchers.

Download Full-text

Fast Playback Framework for Analysis of Ground-Based Doppler Radar Observations Using MapReduce Technology

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech-d-15-0118.1 ◽

2016 ◽

Vol 33 (4) ◽

pp. 621-634 ◽

Cited By ~ 4

Author(s):

Jingyin Tang ◽

Corene J. Matyas

Keyword(s):

Spatial Analysis ◽

High Performance ◽

Large Scale ◽

Doppler Radar ◽

Weather Events ◽

Data Architecture ◽

Research Grade ◽

High Performance Computing Cluster ◽

Time Systems ◽

Performance Computing

AbstractThe creation of a 3D mosaic is often the first step when using the high-spatial- and temporal-resolution data produced by ground-based radars. Efficient yet accurate methods are needed to mosaic data from dozens of radar to better understand the precipitation processes in synoptic-scale systems such as tropical cyclones. Research-grade radar mosaic methods of analyzing historical weather events should utilize data from both sides of a moving temporal window and process them in a flexible data architecture that is not available in most stand-alone software tools or real-time systems. Thus, these historical analyses require a different strategy for optimizing flexibility and scalability by removing time constraints from the design. This paper presents a MapReduce-based playback framework using Apache Spark’s computational engine to interpolate large volumes of radar reflectivity and velocity data onto 3D grids. Designed as being friendly to use on a high-performance computing cluster, these methods may also be executed on a low-end configured machine. A protocol is designed to enable interoperability with GIS and spatial analysis functions in this framework. Open-source software is utilized to enhance radar usability in the nonspecialist community. Case studies during a tropical cyclone landfall shows this framework’s capability of efficiently creating a large-scale high-resolution 3D radar mosaic with the integration of GIS functions for spatial analysis.

Download Full-text

VINCI®, the european reference for ariane 6 upper stage cryogenic propulsive system

Progress in Propulsion Physics – Volume 11 ◽

10.1051/eucass/201911481 ◽

2019 ◽

Author(s):

P. Alliot ◽

J.-F. Delange ◽

V. De Korver ◽

J.-M. Sannino ◽

A. Lekeux ◽

...

Keyword(s):

High Performance ◽

Design Review ◽

Design Concepts ◽

Critical Design ◽

Potential Applications ◽

Upper Stage ◽

Command Systems ◽

Engine Development ◽

Orbital Spacecraft ◽

Government Level

The intent of this publication is to provide an overview of the development of the VINCI® engine over the period 2014–2015. The VINCI® engine is an upper stage, cryogenic expander cycle engine. It combines the required features of this cycle, i. e., high performance chamber cooling and high performance hydrogen turbopump, with proven design concepts based on the accumulated experience from previous European cryogenic engines such as the HM7 and the VULCAIN®. In addition, its high performance and reliability, its restart and throttle capability offer potential applications on various future launcher upper stages as well as orbital spacecraft. At the end of 2014, the VINCI® successfully passed the Critical Design Review that was held after the major subsystem (combustion chamber, fuel and oxygen turbopump) had passed their own Critical Design Review all along the second half of 2014. In December, a Ministerial Conference at government level gave priority to the Ariane 6 program as Europe future launcher. In the framework of this decision, VINCI® was confirmed as the engine to equip Ariane 6 cryogenic upper stage engine. This publication shows how the VINCI development is progressing toward qualification, and also how the requirements of the new Ariane 6 configuration taken into account, i. e., offering new opportunities to the launch system and managing the new constraints. Moreover, the authors capitalize on the development already achieved for the evolution of Ariane 5. In parallel to completing the engine development and qualification, the configuration and the equipment of the propulsive system for Ariane 6 such as the components of the pressurization and helium command systems, board to ground coupling equipment, are being defined.

Download Full-text

Buffer Placement and Sizing for High-Performance Dataflow Circuits

ACM Transactions on Reconfigurable Technology and Systems ◽

10.1145/3477053 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-32

Author(s):

Lana Josipović ◽

Shabnam Sheikhha ◽

Andrea Guerrieri ◽

Paolo Ienne ◽

Jordi Cortadella

Keyword(s):

Performance Optimization ◽

Optimization Model ◽

High Performance ◽

Control Flow ◽

High Level Synthesis ◽

Software Applications ◽

Marked Graphs ◽

Variable Latency ◽

High Level ◽

Strong Contrast

Commercial high-level synthesis tools typically produce statically scheduled circuits. Yet, effective C-to-circuit conversion of arbitrary software applications calls for dataflow circuits, as they can handle efficiently variable latencies (e.g., caches), unpredictable memory dependencies, and irregular control flow. Dataflow circuits exhibit an unconventional property: registers (usually referred to as “buffers”) can be placed anywhere in the circuit without changing its semantics, in strong contrast to what happens in traditional datapaths. Yet, although functionally irrelevant, this placement has a significant impact on the circuit’s timing and throughput. In this work, we show how to strategically place buffers into a dataflow circuit to optimize its performance. Our approach extracts a set of choice-free critical loops from arbitrary dataflow circuits and relies on the theory of marked graphs to optimize the buffer placement and sizing. Our performance optimization model supports important high-level synthesis features such as pipelined computational units, units with variable latency and throughput, and if-conversion. We demonstrate the performance benefits of our approach on a set of dataflow circuits obtained from imperative code.

Download Full-text

META-pipe cloud setup and execution

F1000Research ◽

10.12688/f1000research.13204.1 ◽

2017 ◽

Vol 6 ◽

pp. 2060

Author(s):

Aleksandr Agafonov ◽

Kimmo Mattila ◽

Cuong Duong Tuan ◽

Lars Tiede ◽

Inge Alexander Raknes ◽

...

Keyword(s):

Functional Annotation ◽

High Performance ◽

Sequence Data ◽

Metagenomic Data ◽

Taxonomic Profiling ◽

Geographically Distributed ◽

Computationally Intensive ◽

High Performance Computing Cluster ◽

And Storage ◽

Performance Computing

META-pipe is a complete service for the analysis of marine metagenomic data. It provides assembly of high-throughput sequence data, functional annotation of predicted genes, and taxonomic profiling. The functional annotation is computationally demanding and is therefore currently run on a high-performance computing cluster in Norway. However, additional compute resources are necessary to open the service to all ELIXIR users. We describe our approach for setting up and executing the functional analysis of META-pipe on additional academic and commercial clouds. Our goal is to provide a powerful analysis service that is easy to use and to maintain. Our design therefore uses a distributed architecture where we combine central servers with multiple distributed backends that execute the computationally intensive jobs. We believe our experiences developing and operating META-pipe provides a useful model for others that plan to provide a portal based data analysis service in ELIXIR and other organizations with geographically distributed compute and storage resources.

Download Full-text