Portable multi-node LQCD Monte Carlo simulations using OpenACC

Claudio Bonati; Enrico Calore; Massimo D’Elia; Michele Mesiti; Francesco Negro; Francesco Sanfilippo; Sebastiano Fabio Schifano; Giorgio Silvi; Raffaele Tripiccione

doi:10.1142/s0129183118500109

International Journal of Modern Physics C ◽

10.1142/s0129183118500109 ◽

2018 ◽

Vol 29 (01) ◽

pp. 1850010 ◽

Cited By ~ 8

Author(s):

Claudio Bonati ◽

Enrico Calore ◽

Massimo D’Elia ◽

Michele Mesiti ◽

Francesco Negro ◽

...

Keyword(s):

Monte Carlo ◽

Monte Carlo Simulations ◽

Parallel Programming ◽

Programming Model ◽

State Of The Art ◽

Computer Architectures ◽

Monte Carlo Code ◽

Processor Architectures ◽

Parallel Programming Model ◽

High Level

This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved with the OpenACC parallel programming model, which lets the same code be compiled for several processor architectures. The paper focuses on parallelization across multiple computing nodes, using OpenACC to manage parallelism within a node and OpenMPI to manage parallelism among nodes. We first discuss the strategies available to maximize performance, then describe selected relevant details of the code, and finally measure the performance and scaling that we are able to achieve. The work focuses mainly on GPUs, which offer a high level of performance for this application, but also compares with results measured on other processors.
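As a rough illustration of what parallelism among nodes involves, the sketch below splits one lattice direction into contiguous slabs, one per rank, and records the periodic neighbours with which a rank would exchange boundary sites. The abstract does not say how the lattice is decomposed, so the one-dimensional slab layout, the `slab_t` type and the `slab_for_rank` function are all hypothetical; in a real multi-node code the `prev` and `next` values would be the MPI ranks used for the boundary exchange.

```c
/* Hypothetical description of the slab of lattice sites owned by one rank. */
typedef struct {
    int first;  /* first global index owned along the split direction */
    int count;  /* number of owned sites along that direction */
    int prev;   /* rank owning the slab below (periodic boundary) */
    int next;   /* rank owning the slab above (periodic boundary) */
} slab_t;

/* Split `extent` sites among `nranks` ranks as evenly as possible.
 * The first (extent % nranks) ranks receive one extra site each. */
slab_t slab_for_rank(int extent, int nranks, int rank)
{
    slab_t s;
    int base  = extent / nranks;
    int extra = extent % nranks;

    s.count = base + (rank < extra ? 1 : 0);
    s.first = rank * base + (rank < extra ? rank : extra);
    s.prev  = (rank + nranks - 1) % nranks;
    s.next  = (rank + 1) % nranks;
    return s;
}
```

Within each slab, the site loops would then be candidates for OpenACC directives, while only the boundary sites need to travel between nodes.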

MapReduce Parallel Programming Model: A State-of-the-Art Survey

International Journal of Parallel Programming ◽

10.1007/s10766-015-0395-0 ◽

2015 ◽

Vol 44 (4) ◽

pp. 832-866 ◽

Cited By ~ 24

Author(s):

Ren Li ◽

Haibo Hu ◽

Heng Li ◽

Yunsong Wu ◽

Jianxi Yang

Keyword(s):

Parallel Programming ◽

Programming Model ◽

State Of The Art ◽

Parallel Programming Model

Interaction with the User in the SAPFOR System

Russian Digital Libraries Journal ◽

10.26907/1562-5419-2021-24-1-157-183 ◽

2021 ◽

Vol 24 (1) ◽

pp. 157-183

Author(s):

Никита Андреевич Катаев

Keyword(s):

Parallel Programming ◽

Program Transformation ◽

Heterogeneous Computing ◽

Programming Model ◽

Parallel Programs ◽

Parallel Program ◽

Program Parallelization ◽

Parallel Programming Model ◽

The One ◽

High Level

Automation of parallel programming is important at any stage of parallel program development. These stages include profiling of the original program, program transformation, which allows us to achieve higher performance after program parallelization, and, finally, construction and optimization of the parallel program. It is also important to choose a suitable parallel programming model to express the parallelism available in a program. On the one hand, the parallel programming model should be capable of mapping the parallel program to a variety of existing hardware resources. On the other hand, it should simplify the development of the assistant tools and allow the user to explore the parallel program that the assistant tools generate in a semi-automatic way. The SAPFOR (System FOR Automated Parallelization) system combines various approaches to the automation of parallel programming. Moreover, it allows the user to guide the parallelization if necessary. SAPFOR produces parallel programs according to the high-level DVMH parallel programming model, which simplifies the development of efficient parallel programs for heterogeneous computing clusters. This paper focuses on the approach to semi-automatic parallel programming that SAPFOR implements. We discuss the architecture of the system and present the interactive subsystem, which is useful for guiding SAPFOR through program parallelization. We used the interactive subsystem to parallelize programs from the NAS Parallel Benchmarks in a semi-automatic way. Finally, we compare the performance of manually written parallel programs with that of programs built by the SAPFOR system.

Toward An Architecture Independent High Level Parallel Programming Model For Artificial Intelligence

Parallel Processing for Artificial Intelligence - Machine Intelligence and Pattern Recognition ◽

10.1016/b978-0-444-81837-9.50009-9 ◽

1994 ◽

pp. 57-66

Author(s):

Mark S. BERLIN

Keyword(s):

Artificial Intelligence ◽

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model ◽

High Level

Portable LQCD Monte Carlo code using OpenACC

EPJ Web of Conferences ◽

10.1051/epjconf/201817509008 ◽

2018 ◽

Vol 175 ◽

pp. 09008

Author(s):

Claudio Bonati ◽

Enrico Calore ◽

Simone Coscetti ◽

Massimo D’Elia ◽

Michele Mesiti ◽

...

Keyword(s):

Monte Carlo ◽

Parallel Programming ◽

Scientific Computing ◽

State Of The Art ◽

Production Level ◽

Monte Carlo Code ◽

Design And Optimization ◽

Code Changes ◽

Many Core ◽

Descriptive Level

Ranging from multi-core CPUs to many-core GPUs, the present landscape of HPC architectures is extremely heterogeneous. In this context, code portability is increasingly important for easy maintainability of applications; this is especially relevant in scientific computing, where code changes are numerous and frequent. In this talk we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the OpenACC directives model. OpenACC aims to abstract parallel programming to a descriptive level, where programmers do not need to specify the mapping of the code onto the target machine. We describe the OpenACC implementation and show that the same code is able to target different architectures, including state-of-the-art CPUs and GPUs.

Design and optimization of a portable LQCD Monte Carlo code using OpenACC

International Journal of Modern Physics C ◽

10.1142/s0129183117500632 ◽

2017 ◽

Vol 28 (05) ◽

pp. 1750063 ◽

Cited By ~ 15

Author(s):

Claudio Bonati ◽

Simone Coscetti ◽

Massimo D’Elia ◽

Michele Mesiti ◽

Francesco Negro ◽

...

Keyword(s):

Monte Carlo ◽

Programming Model ◽

State Of The Art ◽

Monte Carlo Code ◽

Performance Portability ◽

Design And Optimization ◽

Target Architecture ◽

Computing Performance ◽

And Performance ◽

Many Core

The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPUs, which support a wide class of applications but deliver moderate computing performance, to many-core Graphics Processing Units (GPUs), which exploit aggressive data parallelism and deliver higher performance for streaming computing applications. In this scenario, code portability (and performance portability) becomes necessary for easy maintainability of applications; this is very relevant in scientific computing, where code changes are very frequent, making it tedious and error-prone to keep different code versions aligned. In this work, we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the directive-based OpenACC programming model. OpenACC abstracts parallel programming to a descriptive level, relieving programmers from specifying how code should be mapped onto the target architecture. We describe the implementation of a code written entirely in OpenACC, and show that we are able to target several different architectures, including state-of-the-art traditional CPUs and GPUs, with the same code. We also measure performance, evaluating the computing efficiency of our OpenACC code on several architectures, comparing it with GPU-specific implementations and showing that a good level of performance portability can be reached.
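To make the "descriptive level" concrete, here is a minimal sketch of a directive-annotated loop in C. It is not taken from the paper's code: the `axpy` function and its arguments are hypothetical, chosen only because a vector update is about the simplest data-parallel kernel. The directive states that the loop iterations are independent and which arrays are read or updated; how the loop is mapped onto a GPU or a multi-core CPU is left to the compiler. A compiler without OpenACC support ignores the pragma and runs the loop serially, which is what lets one source file serve several targets.

```c
/* Hypothetical kernel: y[i] += a * x[i] for i in [0, n).
 * The OpenACC directive marks the loop as parallel and describes data
 * movement: x is only read (copyin), y is read and written back (copy).
 * Without OpenACC support the pragma is ignored and the loop runs serially. */
void axpy(int n, double a, const double *restrict x, double *restrict y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Note that the loop body contains no thread indices, block sizes or device-specific calls; a GPU-specific implementation would have to spell those out by hand.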

Comparison of Experiment With Monte Carlo Simulations on a Reflective Gap Using a Detailed Surface Properties Model

Journal of Heat Transfer ◽

10.1115/1.2825856 ◽

1996 ◽

Vol 118 (2) ◽

pp. 388-393 ◽

Cited By ~ 5

Author(s):

J. Zaworski ◽

J. R. Welty ◽

B. J. Palmer ◽

M. K. Drost

Keyword(s):

Monte Carlo Simulation ◽

Monte Carlo ◽

Monte Carlo Simulations ◽

Grazing Angle ◽

Incident Radiation ◽

Monte Carlo Code ◽

Bidirectional Reflectance Distribution Function ◽

Bidirectional Reflectance ◽

Detailed Surface ◽

Bidirectional Reflectance Distribution

The spatial distribution of light through a rectangular gap bounded by highly reflective, diffuse surfaces was measured and compared with the results of Monte Carlo simulations. Incorporating radiant properties for real surfaces into a Monte Carlo code was seen to be a significant problem; a number of techniques for accomplishing this are discussed. Independent results are reported for measured values of the bidirectional reflectance distribution function over incident polar angles from 0 to 90 deg for a semidiffuse surface treatment (Krylon™ flat white spray paint). The inclusion of this information into a Monte Carlo simulation yielded various levels of agreement with experimental results. The poorest agreement occurred when the incident radiation was at a grazing angle with respect to the surface and the reflectance was nearly specular.

Parallel programming model for the Epiphany many-core coprocessor using threaded MPI

Microprocessors and Microsystems ◽

10.1016/j.micpro.2016.02.006 ◽

2016 ◽

Vol 43 ◽

pp. 95-103 ◽

Cited By ~ 5

Author(s):

James A. Ross ◽

David A. Richie ◽

Song J. Park ◽

Dale R. Shires

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model ◽

Many Core

2D-FMFI SAR application on HPC architectures with OmpSs parallel programming model

2012 NASA/ESA Conference on Adaptive Hardware and Systems (AHS) ◽

10.1109/ahs.2012.6268638 ◽

2012 ◽

Author(s):

Fisnik Kraja ◽

Arndt Bode ◽

Xavier Martorell

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model

Actors as a parallel programming model

STACS 91 - Lecture Notes in Computer Science ◽

10.1007/bfb0020798 ◽

2005 ◽

pp. 184-195 ◽

Cited By ~ 5

Author(s):

Françoise Baude ◽

Guy Vidal-Naquet

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model

A practical parallel programming model

Specification of Parallel Algorithms - DIMACS Series in Discrete Mathematics and Theoretical Computer Science ◽

10.1090/dimacs/018/11 ◽

1994 ◽

pp. 143-160 ◽

Cited By ~ 1

Author(s):

Lawrence Snyder

Keyword(s):

Parallel Programming ◽

Programming Model ◽

Parallel Programming Model