Portable LQCD Monte Carlo code using OpenACC

2018 ◽  
Vol 175 ◽  
pp. 09008
Author(s):  
Claudio Bonati ◽  
Enrico Calore ◽  
Simone Coscetti ◽  
Massimo D’Elia ◽  
Michele Mesiti ◽  
...  

The present landscape of HPC architectures is extremely heterogeneous, ranging from multi-core CPUs to many-core GPUs. In this context, code portability is increasingly important for easy maintainability of applications; this is especially relevant in scientific computing, where code changes are numerous and frequent. In this talk we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the OpenACC directive-based model. OpenACC aims to abstract parallel programming to a descriptive level, where programmers do not need to specify how the code is mapped onto the target machine. We describe the OpenACC implementation and show that the same code is able to target different architectures, including state-of-the-art CPUs and GPUs.

2017 ◽  
Vol 28 (05) ◽  
pp. 1750063 ◽  
Author(s):  
Claudio Bonati ◽  
Simone Coscetti ◽  
Massimo D’Elia ◽  
Michele Mesiti ◽  
Francesco Negro ◽  
...  

The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPUs, which support a wide class of applications but deliver moderate computing performance, to many-core Graphics Processing Units (GPUs), which exploit aggressive data-parallelism and deliver higher performance for streaming computing applications. In this scenario, code portability (and performance portability) becomes necessary for easy maintainability of applications; this is very relevant in scientific computing, where code changes are very frequent, making it tedious and error-prone to keep different code versions aligned. In this work, we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the directive-based OpenACC programming model. OpenACC abstracts parallel programming to a descriptive level, relieving programmers from specifying how codes should be mapped onto the target architecture. We describe the implementation of a code fully written in OpenACC, and show that we are able to target several different architectures, including state-of-the-art traditional CPUs and GPUs, with the same code. We also measure performance, evaluating the computing efficiency of our OpenACC code on several architectures, comparing with GPU-specific implementations and showing that a good level of performance portability can be reached.
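As a minimal illustration of the descriptive style mentioned in this abstract (a sketch only, not taken from the paper), the C fragment below annotates a simple loop with an OpenACC directive. The function name `axpy`, its arguments, and the choice of data clauses are assumptions made for the example. The directive states that the iterations are independent and which arrays are needed; how the loop is mapped onto threads or GPU cores is left to the compiler.

```c
#include <stddef.h>

/* Hypothetical kernel for illustration: y[i] += a * x[i] for i in [0, n). */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    /* Descriptive directive: declares the loop parallel and names the data
     * it reads (x) and updates (y). The mapping onto the target hardware is
     * chosen by the compiler, not written in the source. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

A compiler without OpenACC support ignores the pragma and builds the same source as a serial loop, which is one reason a single code base can target several architectures.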


2018 ◽  
Vol 29 (01) ◽  
pp. 1850010 ◽  
Author(s):  
Claudio Bonati ◽  
Enrico Calore ◽  
Massimo D’Elia ◽  
Michele Mesiti ◽  
Francesco Negro ◽  
...  

This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization across multiple computing nodes, using OpenACC to manage parallelism within the node and OpenMPI to manage parallelism among the nodes. We first discuss the strategies available to maximize performance, then describe selected relevant details of the code, and finally measure the level of performance and scaling that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly higher level of performance for this application, but also compares with results measured on other processors.


2021 ◽  
Vol 247 ◽  
pp. 02034
Author(s):  
P. Mala ◽  
A. Pautz ◽  
H. Ferroukhi ◽  
A. Vasiliev

Currently, safety analyses mostly rely on codes that solve both the neutronics and the thermal-hydraulics with assembly-wise nodal resolution, as multiphysics heterogeneous transport solvers are still too expensive in time and memory. The pin-by-pin homogenized codes can be seen as a bridge between the heterogeneous codes and the traditional nodal assembly-wise calculations. In this work, the pin-by-pin simplified transport solver Tortin has been coupled with the sub-channel code COBRA-TF. The verification of the 3D solver of Tortin is presented first, showing very good agreement in terms of axial and radial power profiles with the Monte Carlo code SERPENT for a small minicore and with the state-of-the-art nodal code SIMULATE5 for a quarter core without feedback. Then the results of Tortin+COBRA-TF are compared with SIMULATE5 for a single-assembly problem with feedback. The axial profiles of power and moderator temperature show good agreement, while the fuel temperatures differ by up to 40 K. This is caused mainly by the different gap and fuel conductance parameters used in COBRA-TF and in SIMULATE5.


2015 ◽  
Vol 82 ◽  
pp. 90-97 ◽  
Author(s):  
Paul K. Romano ◽  
Nicholas E. Horelik ◽  
Bryan R. Herman ◽  
Adam G. Nelson ◽  
Benoit Forget ◽  
...  

Kerntechnik ◽  
2015 ◽  
Vol 80 (4) ◽  
pp. 394-401 ◽  
Author(s):  
S. S. Aleshin ◽  
S. S. Gorodkov ◽  
A. I. Shcherenko
