Runtime mechanisms to survive new HPC architectures: A use case in human respiratory simulations

Author(s):  
Marta Garcia-Gasulla ◽  
Filippo Mantovani ◽  
Marc Josep-Fabrego ◽  
Beatriz Eguzkitza ◽  
Guillaume Houzeaux

Computational fluid and particle dynamics (CFPD) simulations are of paramount importance for studying and improving drug effectiveness. The computational requirements of CFPD codes demand high-performance computing (HPC) resources. For these reasons, we introduce and evaluate in this article system software techniques for improving performance and tolerating load imbalance in a state-of-the-art production CFPD code. We demonstrate the benefits of these techniques on Intel-, IBM- and Arm-based HPC technologies ranked among the Top500 supercomputers, showing the importance of mechanisms applied at runtime to improve performance independently of the underlying architecture. We run a real CFPD simulation of particle tracking in the human respiratory system, showing performance improvements of up to 2× across different architectures when applying the runtime techniques while keeping the computational resources constant.
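
As background to the load-imbalance problem this abstract targets, the sketch below shows one generic runtime approach: distributing particle-tracking work packages dynamically with a master/worker scheme over MPI (via mpi4py), so that faster ranks absorb extra work without changing the allocated resources. This is an illustrative assumption, not the mechanism the authors apply to their production code; function names and work sizes are placeholders.

```python
# Hypothetical sketch: dynamic (master/worker) distribution of particle-tracking
# work packages so that faster ranks pick up extra work, tolerating load imbalance.
# Illustration of runtime load balancing in general, not the paper's mechanism.
from mpi4py import MPI

TASK_TAG, STOP_TAG = 1, 2

def track_particles(chunk):
    # Placeholder for the expensive per-chunk computation (e.g., particle advection).
    return sum(chunk)

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    chunks = [list(range(i, i + 1000)) for i in range(0, 16000, 1000)]  # toy work packages
    status = MPI.Status()
    results, next_chunk = [], 0
    active = size - 1
    while active > 0:
        result = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        if result is not None:
            results.append(result)
        if next_chunk < len(chunks):
            comm.send(chunks[next_chunk], dest=status.Get_source(), tag=TASK_TAG)
            next_chunk += 1
        else:
            comm.send(None, dest=status.Get_source(), tag=STOP_TAG)
            active -= 1
    print("total:", sum(results))
else:
    comm.send(None, dest=0, tag=TASK_TAG)          # announce availability
    while True:
        status = MPI.Status()
        chunk = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == STOP_TAG:
            break
        comm.send(track_particles(chunk), dest=0, tag=TASK_TAG)
```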


2016 ◽  
pp. 2373-2384
Author(s):  
Yaser Jararweh ◽  
Moath Jarrah ◽  
Abdelkader Bousselham

Current state-of-the-art GPU-based systems offer unprecedented performance advantages by accelerating the most compute-intensive portions of applications by an order of magnitude. GPU computing presents a viable solution for the ever-increasing complexity of applications and the growing demand for immense computational resources. In this paper the authors investigate different GPU-based system platforms, ranging from Personal Supercomputing (PSC) to cloud-based GPU systems. They explore and evaluate these platforms and present a comparative discussion against conventional high-performance cluster-based computing systems. Their evaluation shows the potential advantages of GPU-based systems for high-performance computing applications across different scaling granularities.
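
To make the idea of "accelerating the most compute-intensive portions" concrete, here is a minimal, hedged sketch of offloading an elementwise kernel to a GPU and timing it against a CPU baseline. CuPy is an assumed library choice for illustration; the paper compares platforms, not a specific API.

```python
# Hypothetical sketch: offloading a compute-intensive kernel to the GPU with CuPy
# (an assumed library choice; the paper evaluates platforms, not a specific API).
import time
import numpy as np
import cupy as cp

n = 10_000_000
x_cpu = np.random.random(n).astype(np.float32)

# CPU baseline: an elementwise transcendental workload.
t0 = time.perf_counter()
y_cpu = np.sqrt(np.sin(x_cpu) ** 2 + np.cos(x_cpu) ** 2)
cpu_s = time.perf_counter() - t0

# Same workload on the GPU: copy in, compute, synchronize, copy out.
x_gpu = cp.asarray(x_cpu)
t0 = time.perf_counter()
y_gpu = cp.sqrt(cp.sin(x_gpu) ** 2 + cp.cos(x_gpu) ** 2)
cp.cuda.Device().synchronize()          # wait for the asynchronous kernel to finish
gpu_s = time.perf_counter() - t0

assert np.allclose(y_cpu, cp.asnumpy(y_gpu), atol=1e-5)
print(f"CPU {cpu_s:.3f}s vs GPU {gpu_s:.3f}s")
```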


2021 ◽  
Author(s):  
Jason Thompson ◽  
Haifeng Zhao ◽  
Sachith Seneviratne ◽  
Rohan Byrne ◽  
Rajith Vidanaarachichi ◽  
...  

The sudden onset of the COVID-19 global health crisis and associated economic and social fall-out has highlighted the importance of speed in modeling emergency scenarios so that robust, reliable evidence can be placed in policy and decision-makers’ hands as swiftly as possible. For computational social scientists who are building complex policy models but who lack ready access to high-performance computing facilities, such time-pressure can hinder effective engagement. Popular and accessible agent-based modeling platforms such as NetLogo can be fast to develop, but slow to run when exploring broad parameter spaces on individual workstations. However, while deployment on high-performance computing (HPC) clusters can achieve marked performance improvements, transferring models from workstations to HPC clusters can also be a technically challenging and time-consuming task. In this paper we present a set of generic templates that can be used and adapted by NetLogo users who have access to HPC clusters but require additional support for deploying their models on such infrastructure. We show that model run-time speed improvements of between 200x and 400x over desktop machines are possible using 1) a benchmark ‘wolf-sheep predation’ model in addition to 2) an example drawn from our own work modeling the spread of COVID-19 in Victoria, Australia. We describe how a focus on improving model speed is non-trivial for model development and discuss its practical importance for improved policy and decision-making in the real world. We provide all associated documentation in a linked git repository.
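
For readers unfamiliar with how such parameter sweeps are typically fanned out, the sketch below runs independent NetLogo BehaviorSpace experiments in parallel via the standard headless launcher. The install path, model file, experiment names and worker count are placeholder assumptions; the authors' actual templates live in their linked repository and target cluster batch schedulers.

```python
# Hypothetical sketch: fanning out NetLogo BehaviorSpace experiments across the
# cores of a node. Paths, model and experiment names are placeholders; the
# authors' actual templates are in their linked repository.
import subprocess
from concurrent.futures import ProcessPoolExecutor

NETLOGO = "/opt/netlogo/netlogo-headless.sh"     # assumed install location
MODEL = "wolf-sheep-predation.nlogo"             # placeholder model file
EXPERIMENTS = [f"sweep-{i}" for i in range(16)]  # BehaviorSpace experiment names

def run_experiment(name: str) -> int:
    """Run one headless BehaviorSpace experiment and write its results as CSV."""
    cmd = [NETLOGO, "--model", MODEL, "--experiment", name,
           "--table", f"results-{name}.csv"]
    return subprocess.run(cmd, check=False).returncode

if __name__ == "__main__":
    # One worker per experiment chunk; on an HPC cluster each of these could
    # instead be a separate batch-scheduler (e.g. SLURM array) task.
    with ProcessPoolExecutor(max_workers=8) as pool:
        codes = list(pool.map(run_experiment, EXPERIMENTS))
    print("non-zero exit codes:", [c for c in codes if c != 0])
```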


Author(s):  
Reiner Anderl ◽  
Orkun Yaman

High-performance computing (HPC) has become ubiquitous for simulations in the industrial context. To identify the requirements for integrating HPC-relevant data and processes, a survey was conducted among German car manufacturers as well as service and component suppliers. This contribution presents the results of that evaluation and suggests an architecture concept for integrating data and workflows related to CAE and HPC facilities into PLM. It describes the state of the art of HPC applications within the simulation domain. Intensive efforts are currently invested in CAE data management; however, a systematic approach to managing HPC data does not yet exist. This study highlights the importance of an integrative approach to data management for HPC applications and develops an architectural framework for implementing HPC data management within the existing PLM landscape. Requirements on key functionalities and interfaces are defined, and a framework for a reference information model is conceptualized.
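
To illustrate what a reference information model linking PLM-managed CAE records to HPC job metadata might look like, here is a minimal sketch using Python dataclasses. All class and field names are illustrative assumptions, not the framework defined in the contribution, which specifies this at the architectural level.

```python
# Hypothetical sketch of a reference information model linking CAE simulation
# records managed in PLM with HPC job metadata. Names are illustrative only.
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class HPCJob:
    job_id: str                 # scheduler job identifier (e.g. a SLURM job id)
    cluster: str                # target HPC system
    nodes: int
    walltime_hours: float
    solver: str                 # CAE solver binary/version used on the cluster
    result_uri: str             # where the raw result files are archived

@dataclass
class CAESimulation:
    simulation_id: str
    product_item: str           # PLM item/part the simulation belongs to
    cad_revision: str           # CAD model revision the mesh was derived from
    discipline: str             # e.g. "crash", "CFD", "NVH"
    created: datetime = field(default_factory=datetime.utcnow)
    hpc_jobs: List[HPCJob] = field(default_factory=list)

    def attach_job(self, job: HPCJob) -> None:
        """Record the HPC execution that produced (part of) this simulation's results."""
        self.hpc_jobs.append(job)
```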


2015 ◽  
Vol 12 (1) ◽  
pp. 1-15 ◽  
Author(s):  
Luis F. Castillo ◽  
Germán López-Gartner ◽  
Gustavo A. Isaza ◽  
Mariana Sánchez ◽  
Jeferson Arango ◽  
...  

The need to process large quantities of data generated from genomic sequencing has resulted in a difficult task for life scientists who are not familiar with command-line operations or with developments in high-performance computing and parallelization. This knowledge gap, along with unfamiliarity with the necessary processes, can hinder the execution of data-processing tasks. Furthermore, many of the bioinformatics tools commonly used by the scientific community are presented as isolated, unrelated entities that do not provide an integrated, guided, and assisted interaction with the scheduling facilities of computational resources, or with distribution, processing, and mapping with runtime analysis. This paper presents a first approximation of a Web Services platform-based architecture (GITIRBio) that acts as a distributed front-end system for autonomous and assisted processing of parallel bioinformatics pipelines and that has been validated using multiple sequences. Additionally, the platform allows integration with semantic repositories of genes for searching annotations. GITIRBio is available at: http://c-head.ucaldas.edu.co:8080/gitirbio
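
The sketch below shows, in generic form, the kind of parallel pipeline such a front-end orchestrates: each sample flows through the same chain of stages while whole samples are processed concurrently. The stage functions are placeholders, not GITIRBio's actual services or web interface.

```python
# Hypothetical sketch of a parallel bioinformatics pipeline of the kind a
# front-end such as GITIRBio orchestrates. Stage contents are placeholders.
from concurrent.futures import ProcessPoolExecutor

def quality_control(sample: str) -> str:
    return f"{sample}.trimmed"          # stand-in for e.g. read trimming

def align(sample: str) -> str:
    return f"{sample}.bam"              # stand-in for alignment to a reference

def annotate(sample: str) -> str:
    return f"{sample}.annotated"        # stand-in for annotation lookup

PIPELINE = (quality_control, align, annotate)

def run_pipeline(sample: str) -> str:
    result = sample
    for stage in PIPELINE:              # stages run in sequence per sample
        result = stage(result)
    return result

if __name__ == "__main__":
    samples = [f"sample_{i}.fastq" for i in range(8)]
    with ProcessPoolExecutor() as pool: # samples run in parallel
        for out in pool.map(run_pipeline, samples):
            print(out)
```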


2019 ◽  
Vol 214 ◽  
pp. 07012 ◽  
Author(s):  
Nikita Balashov ◽  
Maxim Bashashin ◽  
Pavel Goncharov ◽  
Ruslan Kuchumov ◽  
Nikolay Kutovskiy ◽  
...  

Cloud computing has become a routine tool for scientists in many fields. The JINR cloud infrastructure provides JINR users with computational resources for various scientific calculations. To speed up the achievement of scientific results, the JINR cloud service for parallel applications has been developed. It consists of several components and implements a flexible, modular architecture that allows additional applications and various types of resources to be used as computational back-ends. An example of using the Cloud&HybriLIT resources in scientific computing is the study of superconducting processes in stacked long Josephson junctions (LJJ). LJJ systems have been studied intensively because of the prospect of practical applications in nano-electronics and quantum computing. In this contribution we summarize the experience of applying the Cloud&HybriLIT resources to high-performance computing of physical characteristics of the LJJ system.
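
As background on the physics workload, the phase dynamics of a single long Josephson junction is commonly modeled by a perturbed sine-Gordon equation; the sketch below integrates it with a simple explicit finite-difference scheme. The paper studies the coupled equations of a stack of junctions, so this single-junction toy model and all parameter values are illustrative assumptions only.

```python
# Hypothetical sketch: explicit finite-difference integration of the perturbed
# sine-Gordon equation  phi_tt + alpha*phi_t = phi_xx - sin(phi) + gamma
# for a single long Josephson junction in normalized units (toy model only).
import numpy as np

L, N = 40.0, 800            # junction length and number of grid points
dx = L / N
dt = 0.9 * dx               # CFL-safe time step for the wave part
alpha, gamma = 0.1, 0.2     # damping and normalized bias current (illustrative)

x = np.linspace(0.0, L, N)
phi = 4.0 * np.arctan(np.exp(x - L / 2))   # initial fluxon (kink) profile
phi_old = phi.copy()                        # zero initial velocity

def laplacian(f):
    lap = np.zeros_like(f)
    lap[1:-1] = (f[2:] - 2.0 * f[1:-1] + f[:-2]) / dx**2
    lap[0] = lap[1]          # crude zero-gradient boundaries
    lap[-1] = lap[-2]
    return lap

for _ in range(5000):
    accel = laplacian(phi) - np.sin(phi) + gamma - alpha * (phi - phi_old) / dt
    phi_new = 2.0 * phi - phi_old + dt**2 * accel
    phi_old, phi = phi, phi_new

# The DC voltage across the junction is proportional to the time-average of d(phi)/dt.
print("mean d(phi)/dt:", np.mean((phi - phi_old) / dt))
```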


Author(s):  
JOST BERTHOLD ◽  
HANS-WOLFGANG LOIDL ◽  
KEVIN HAMMOND

Over time, several competing approaches to parallel Haskell programming have emerged. Different approaches support parallelism at various scales, ranging from small multicores to massively parallel high-performance computing systems. They also provide varying degrees of control, ranging from completely implicit approaches to ones providing full programmer control. Most current designs assume a shared memory model at the programmer, implementation and hardware levels. This is, however, becoming increasingly divorced from the reality at the hardware level. It also imposes significant unwanted runtime overheads in the form of garbage collection synchronisation, etc. What is needed is an easy way to abstract over the implementation and hardware levels, while presenting a simple parallelism model to the programmer. The PArallEl shAred Nothing runtime system design aims to provide a portable and high-level shared-nothing implementation platform for parallel Haskell dialects. It abstracts over major issues such as work distribution and data serialisation, consolidating existing, successful designs into a single framework. It also provides an optional virtual shared-memory programming abstraction for (possibly) shared-nothing parallel machines, such as modern multicore/manycore architectures or cluster/cloud computing systems. It builds on, unifies, and extends existing, well-developed support for shared-memory parallelism that is provided by the widely used GHC Haskell compiler. This paper summarises the state of the art in shared-nothing parallel Haskell implementations, introduces the PArallEl shAred Nothing abstractions, shows how they can be used to implement three distinct parallel Haskell dialects, and demonstrates that good scalability can be obtained on recent parallel machines.
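
To convey the shared-nothing model the abstract describes, here is a conceptual analogue in Python rather than Haskell: worker processes share no heap, and work units and results travel between them only as serialised messages. This is an illustration of the programming model, not the PArallEl shAred Nothing runtime itself.

```python
# Hypothetical sketch of shared-nothing parallelism: each worker process owns its
# private state, and all communication happens by passing serialised messages.
from multiprocessing import Process, Queue

def worker(tasks: Queue, results: Queue) -> None:
    while True:
        item = tasks.get()
        if item is None:                 # sentinel: no more work
            break
        results.put(sum(i * i for i in range(item)))

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results)) for _ in range(4)]
    for p in workers:
        p.start()
    work = [100_000 + i for i in range(20)]
    for w in work:                       # distribute serialised work units
        tasks.put(w)
    for _ in workers:                    # one sentinel per worker
        tasks.put(None)
    total = sum(results.get() for _ in work)
    for p in workers:
        p.join()
    print("combined result:", total)
```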


Acta Numerica ◽  
2012 ◽  
Vol 21 ◽  
pp. 379-474 ◽  
Author(s):  
J. J. Dongarra ◽  
A. J. van der Steen

This article describes the current state of the art of high-performance computing systems, and attempts to shed light on near-future developments that might prolong the steady growth in speed of such systems, which has been one of their most remarkable characteristics. We review the different ways devised to speed them up, both with regard to components and their architecture. In addition, we discuss the requirements for software that can take advantage of existing and future architectures.


2009 ◽  
Vol 01 (04) ◽  
pp. 737-763 ◽  
Author(s):  
E. MOEENDARBARY ◽  
T. Y. NG ◽  
M. ZANGENEH

The dissipative particle dynamics (DPD) technique is a relatively new mesoscale technique which was initially developed to simulate hydrodynamic behavior in mesoscopic complex fluids. It is essentially a particle technique in which molecules are clustered into said particles, and this coarse graining is a very important aspect of DPD as it allows significant computational speed-up. This increased computational efficiency, coupled with the recent advent of high-performance computing, has subsequently enabled researchers to numerically study a host of complex fluid applications at a refined level. In this review, we trace the developments of various important aspects of the DPD methodology since it was first proposed in the early 1990s. In addition, we review notable published works which employed DPD simulation for complex fluid applications.
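
For context, the pairwise forces underlying the DPD method are commonly written in the Groot–Warren form, summarised below as background rather than reproduced from the review: each particle pair within a cutoff interacts through conservative, dissipative and random contributions linked by a fluctuation-dissipation constraint.

```latex
% Standard Groot--Warren DPD pairwise forces (background summary).
\begin{align}
  \mathbf{F}_{ij} &= \mathbf{F}^{C}_{ij} + \mathbf{F}^{D}_{ij} + \mathbf{F}^{R}_{ij},\\
  \mathbf{F}^{C}_{ij} &= a_{ij}\,\omega^{C}(r_{ij})\,\hat{\mathbf{e}}_{ij},\\
  \mathbf{F}^{D}_{ij} &= -\gamma\,\omega^{D}(r_{ij})\,(\hat{\mathbf{e}}_{ij}\cdot\mathbf{v}_{ij})\,\hat{\mathbf{e}}_{ij},\\
  \mathbf{F}^{R}_{ij} &= \sigma\,\omega^{R}(r_{ij})\,\xi_{ij}\,\Delta t^{-1/2}\,\hat{\mathbf{e}}_{ij},
\end{align}
% with the fluctuation--dissipation constraint
% \omega^{D}(r) = [\omega^{R}(r)]^{2}, \qquad \sigma^{2} = 2\gamma k_{B}T,
% where \hat{\mathbf{e}}_{ij} is the unit vector from j to i, \mathbf{v}_{ij} the
% relative velocity, and \xi_{ij} a symmetric zero-mean, unit-variance random variable.
```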


Author(s):  
Marc Casas ◽  
Wilfried N Gansterer ◽  
Elias Wimmer

We investigate the usefulness of gossip-based reduction algorithms in a high-performance computing (HPC) context. We compare them to state-of-the-art deterministic parallel reduction algorithms in terms of fault tolerance and resilience against silent data corruption (SDC) as well as in terms of performance and scalability. New gossip-based reduction algorithms are proposed, which significantly improve the state-of-the-art in terms of resilience against SDC. Moreover, a new gossip-inspired reduction algorithm is proposed, which promises a much more competitive runtime performance in an HPC context than classical gossip-based algorithms, in particular for low accuracy requirements.
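
As a concrete reference point for what "gossip-based reduction" means, the sketch below simulates the classical push-sum protocol for computing a global average: every node keeps a (value, weight) pair, repeatedly shares half of both with a random peer, and the ratio value/weight converges to the mean. This illustrates the classical scheme only, not the new algorithms proposed in the paper.

```python
# Hypothetical sketch of the classical push-sum gossip protocol for averaging.
import random

def push_sum(values, rounds=50, seed=0):
    rng = random.Random(seed)
    n = len(values)
    s = list(map(float, values))   # running sums
    w = [1.0] * n                  # running weights
    for _ in range(rounds):
        inbox_s = [0.0] * n
        inbox_w = [0.0] * n
        for i in range(n):
            j = rng.randrange(n)   # random peer (may be itself)
            half_s, half_w = s[i] / 2.0, w[i] / 2.0
            inbox_s[i] += half_s; inbox_w[i] += half_w   # keep one half
            inbox_s[j] += half_s; inbox_w[j] += half_w   # push the other half
        s, w = inbox_s, inbox_w
    return [si / wi for si, wi in zip(s, w)]             # local estimates of the mean

data = [3.0, 7.0, 1.0, 9.0, 5.0]
print(push_sum(data))              # each entry should be close to 5.0 (the true mean)
```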

