High Performance Computing Cluster
Recently Published Documents

TOTAL DOCUMENTS: 73 (five years: 32)

H-INDEX: 5 (five years: 3)

2021 · Vol 12 (1)
Author(s):  
Julia Koehler Leman ◽  
Sergey Lyskov ◽  
Steven M. Lewis ◽  
Jared Adolf-Bryfogle ◽  
Rebecca F. Alford ◽  
...  

Abstract: Each year, vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite increases in the dimensionality of data and the complexity of workflows and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. Integration with a high performance computing cluster allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable to developers and users of any type of scientific software, and to the scientific community for creating reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.
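The core idea of such a test server — benchmarks registered once, then run automatically on every change and compared against stored thresholds — can be sketched in a few lines. All names below are hypothetical illustrations, not the actual Rosetta benchmark framework API:

```python
# Minimal sketch of a scientific-benchmark registry with pass/fail gating.
# Hypothetical names throughout; not the real Rosetta test-server interface.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Benchmark:
    name: str
    run: Callable[[], float]   # runs a protocol, returns a score
    threshold: float           # minimum acceptable score

REGISTRY: Dict[str, Benchmark] = {}

def register(name: str, threshold: float):
    """Decorator that adds a benchmark to the global registry."""
    def wrap(fn: Callable[[], float]) -> Callable[[], float]:
        REGISTRY[name] = Benchmark(name, fn, threshold)
        return fn
    return wrap

@register("toy_score_stability", threshold=0.9)
def toy_benchmark() -> float:
    # A real benchmark would run a modeling protocol and score it
    # against a stored reference (a "protocol capture").
    return 0.95

def run_all() -> Dict[str, bool]:
    """What a cluster job would do on each commit: run everything, report pass/fail."""
    return {name: b.run() >= b.threshold for name, b in REGISTRY.items()}
```

In a continuous-integration setup, a scheduler on the cluster would call something like `run_all()` after each commit and publish the pass/fail map.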


2021 · Vol 11 (1)
Author(s):  
Devin O’Kelly ◽  
James Campbell ◽  
Jeni L. Gerberich ◽  
Paniz Karbasi ◽  
Venkat Malladi ◽  
...  

Abstract: Multispectral photoacoustic tomography resolves the spectral components of a tissue or sample at high spatiotemporal resolution. With the availability of commercial instruments, data acquisition with this modality has become consistent and standardized. However, analysis of such data is often hampered by opaque processing algorithms that are challenging to verify and validate from a user perspective. Furthermore, such tools are inflexible, often locking users into a restricted set of processing motifs that may not accommodate the demands of diverse experiments. To address these needs, we have developed a Reconstruction, Analysis, and Filtering Toolbox to support the analysis of photoacoustic imaging data. The toolbox includes several algorithms to improve the overall quantification of photoacoustic imaging, including non-negative constraints and multispectral filters. We demonstrate various use cases, including dynamic imaging challenges and quantification of drug effects, and describe the ability of the toolbox to be parallelized on a high performance computing cluster.
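The non-negative constraint mentioned above is the key quantification step in multispectral unmixing: per-pixel chromophore concentrations are solved as a least-squares problem restricted to physically meaningful (non-negative) values. A toy sketch with invented endmember spectra, using standard non-negative least squares rather than the toolbox's own routines:

```python
# Toy multispectral unmixing with a non-negative constraint.
# Endmember spectra and pixel values are invented for illustration.
import numpy as np
from scipy.optimize import nnls

# Columns: absorption spectra of two chromophores at three wavelengths.
endmembers = np.array([
    [1.0, 0.2],
    [0.5, 0.8],
    [0.1, 1.0],
])

# Measured per-pixel signal at the same three wavelengths
# (constructed here from known ground-truth concentrations 2 and 3).
pixel = endmembers @ np.array([2.0, 3.0])

# NNLS recovers the concentrations while forbidding negative values.
concentrations, residual = nnls(endmembers, pixel)
```

Applied per pixel across an image stack, this is an embarrassingly parallel workload, which is why such toolboxes map naturally onto a computing cluster.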


2021 · Vol 2021 · pp. 1-10
Author(s):  
Bingzheng Li ◽  
Jinchen Xu ◽  
Zijing Liu

With the development of high-performance computing and big data applications, the scale of data transmitted, stored, and processed by high-performance computing cluster systems is increasing explosively. Efficiently compressing large-scale data to reduce the space required for storage and transmission is one of the keys to improving the performance of such systems. In this paper, we present SW-LZMA, a parallel design and optimization of LZMA for the Sunway SW26010 heterogeneous many-core processor. Guided by the characteristics of the SW26010, we analyse the storage space requirements, memory access patterns, and hotspot functions of the LZMA algorithm and implement thread-level parallelism based on the Athread interface. Furthermore, we lay out the LDM address space at fine granularity to implement a DMA double-buffered cyclic sliding-window scheme, which further optimizes SW-LZMA's performance. Experimental results show that, compared with the serial baseline implementation of LZMA, the parallel algorithm achieves a maximum speedup of 4.1x on the Silesia corpus benchmark and 5.3x on a large-scale data set.
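The thread-level strategy described — split the input into fixed-size chunks, compress them concurrently, and keep chunk boundaries so decompression parallelizes the same way — can be illustrated in plain Python with the standard lzma module. This is only a sketch of the chunking idea, not the Sunway/Athread implementation; chunk size and worker count are arbitrary choices here:

```python
# Chunked, thread-parallel LZMA: an illustration of the SW-LZMA idea
# using the standard library, not the Athread/SW26010 implementation.
import lzma
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 16  # 64 KiB per chunk (arbitrary; the paper sizes buffers to fit LDM)

def compress_parallel(data: bytes, workers: int = 4) -> list[bytes]:
    """Compress fixed-size chunks concurrently, preserving chunk boundaries."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lzma.compress, chunks))

def decompress_parallel(blocks: list[bytes], workers: int = 4) -> bytes:
    """Decompress independent blocks concurrently and reassemble."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(lzma.decompress, blocks))

payload = b"high performance computing cluster " * 10_000
blocks = compress_parallel(payload)
restored = decompress_parallel(blocks)
```

The trade-off is the same one the paper negotiates: smaller chunks expose more parallelism (and fit scratchpad memory) but shrink the match window available to each compressor, costing some compression ratio.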


2021
Author(s):  
Surya Saha ◽  
Amanda M Cooksey ◽  
Anna K Childers ◽  
Monica Poelchau ◽  
Fiona McCarthy

Sequencing of a diverse array of arthropod genomes is already underway, and these genomes will be used to study human health, agriculture, biodiversity, and ecology. The new genomes are intended to serve as community resources and to provide the foundational information required to apply omics technologies to a more diverse set of species. However, biologists need genome annotation to use these genomes and to derive a better understanding of complex biological systems. Genome annotation incorporates two related but distinct processes: demarcating genes and other elements present in genome sequences (structural annotation), and associating function with genetic elements (functional annotation). While there are well-established, freely available workflows for structural annotation (gene identification) in newly assembled genomes, workflows for providing the functional annotation required to support functional genomics studies are less well understood. Genome-scale functional annotation is required for functional modeling (enrichment, networks, etc.), and a first-pass genome-wide functional annotation effort can rapidly identify under-represented gene sets for focused community annotation efforts. We present an open source, open access, containerized pipeline for genome-scale functional annotation of insect proteomes and apply it to a diverse range of arthropod species. We show that the performance of the predictions is consistent across a set of arthropod genomes of varying assembly and annotation quality. Complete instructions for running each component of the functional annotation pipeline on the command line, on a high performance computing cluster, and in the CyVerse Discovery Environment can be found at the readthedocs site (https://agbase-docs.readthedocs.io/en/latest/agbase/workflow.html).
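The "first-pass" coverage check described above — summarizing which structurally predicted proteins still lack functional terms, to target community curation — amounts to simple bookkeeping over the pipeline's output. A toy sketch with invented protein IDs and GO terms:

```python
# Toy first-pass summary of functional-annotation coverage.
# Protein IDs and GO terms are invented for illustration.
from collections import Counter

# protein -> GO terms assigned by a functional-annotation pipeline
annotations = {
    "prot1": {"GO:0003824", "GO:0008152"},  # catalytic activity, metabolism
    "prot2": {"GO:0003824"},
    "prot3": set(),                          # structurally predicted, no function yet
    "prot4": set(),
}

# Proteins with no functional terms: candidates for community annotation.
unannotated = sorted(p for p, terms in annotations.items() if not terms)

# Fraction of the proteome with at least one functional term.
coverage = 1 - len(unannotated) / len(annotations)

# How often each term is assigned, for spotting under-represented categories.
term_counts = Counter(t for terms in annotations.values() for t in terms)
```

At genome scale the same tallies, broken down by gene family or GO branch, flag the under-represented sets the text mentions.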


2021
Author(s):  
Lucas Varella ◽  
Patricia Plentz ◽  
Hugo Watanuki ◽  
Artur Baruchi

This work explores orchestration of the HPCC Systems (High Performance Computing Cluster) platform in containerized cloud environments using the Kubernetes orchestration tool. The goal is to evaluate the characteristics, benefits, and challenges of deploying the HPCC Systems platform under this paradigm across different public cloud providers, specifically Amazon Web Services (AWS) and Microsoft Azure. Preliminary results suggest that the orchestration paradigm brings several benefits to the platform, but that strict assumptions about data-storage persistence and host-specific shared resources, among other conditions, create challenges when moving the technology to this environment.



2021 · Vol 251 · pp. 03033
Author(s):  
Micah Groh ◽  
Norman Buchanan ◽  
Derek Doyle ◽  
James B. Kowalkowski ◽  
Marc Paterno ◽  
...  

Modern experiments in high energy physics analyze millions of events recorded in particle detectors to select events of interest and measure physics parameters. These data are often stored as tabular files containing detector information and reconstructed quantities. Most current techniques for event selection in these files lack the scalability needed for high performance computing environments. We describe our work to develop a high energy physics analysis framework suitable for high performance computing. The new framework uses modern tools for reading files and implicit data parallelism. Users analyze tabular data with standard, easy-to-use data analysis techniques in Python, while the framework handles file manipulation and parallelism without requiring advanced experience in parallel programming. In future versions, we hope to provide a framework that can run on a personal computer or a high performance computing cluster with little change to the user code.
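The user-facing idiom described — ordinary column expressions over tabular event data, with I/O and parallelism handled behind the scenes — looks like standard boolean-mask selection. Column names and cut values below are invented for illustration:

```python
# Sketch of tabular event selection in the style such a framework exposes.
# Columns and cuts are invented; a real analysis would read detector files.
import pandas as pd

events = pd.DataFrame({
    "energy": [1.2, 5.7, 3.3, 8.1, 0.4],  # reconstructed energy per event
    "n_hits": [12, 48, 30, 55, 7],        # detector hits per event
})

# An analysis "cut": the boolean-mask expression a framework user would write.
# The framework's job is to evaluate this across many files in parallel.
selected = events[(events.energy > 2.0) & (events.n_hits >= 30)]
```

Because each file chunk can be filtered independently, the same expression parallelizes implicitly across a cluster without changes to the user code, which is the portability goal stated above.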

