Achieving Efficient Strong Scaling with PETSc Using Hybrid MPI/OpenMP Optimisation

Author(s): Michael Lange, Gerard Gorman, Michèle Weiland, Lawrence Mitchell, James Southern

Author(s): Bálint Joó, Mike A. Clark

The QUDA library for optimized lattice quantum chromodynamics using GPUs, combined with a high-level application framework such as the Chroma software system, provides a powerful tool for computing quark propagators, a key step in current calculations of hadron spectroscopy, nuclear structure, and nuclear forces. In this contribution we discuss our experiences, including performance and strong scaling of the QUDA library and Chroma on the Edge Cluster at Lawrence Livermore National Laboratory and on various clusters at Jefferson Lab. We highlight some scientific successes and consider future directions for graphics processing units in lattice quantum chromodynamics calculations.
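
Strong scaling, i.e. a fixed total problem size solved on an increasing number of processes, is the common thread of the papers collected here; efficiency is conventionally reported as E(P) = T(1)/(P · T(P)). Below is a minimal C++/MPI timing harness showing how such a measurement is typically set up. It is illustrative only: compute_step() is a hypothetical stand-in for the real solver workload, and none of this is QUDA or Chroma code.

// Minimal strong-scaling timing harness (illustrative only, not QUDA/Chroma code).
// The total work n_global is fixed; each rank's share shrinks as P grows.
#include <mpi.h>
#include <cmath>
#include <cstdio>

// Hypothetical stand-in for one pass of the real solver workload.
static double compute_step(long n_local) {
  double acc = 0.0;
  for (long i = 0; i < n_local; ++i) acc += std::sqrt(static_cast<double>(i));
  return acc;
}

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank = 0, nprocs = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  const long n_global = 100000000L;        // fixed problem size ("strong" scaling)
  const long n_local  = n_global / nprocs; // per-rank share (remainder ignored here)

  MPI_Barrier(MPI_COMM_WORLD);
  double t0 = MPI_Wtime();
  volatile double sink = compute_step(n_local); // volatile: keep the work alive
  (void)sink;
  double t_local = MPI_Wtime() - t0;

  // Report the slowest rank's time; that is the wall time a user observes.
  double t_max = 0.0;
  MPI_Reduce(&t_local, &t_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
  if (rank == 0)
    std::printf("P=%d  T(P)=%.3fs  (efficiency vs a P=1 run: T(1)/(P*T(P)))\n",
                nprocs, t_max);

  MPI_Finalize();
  return 0;
}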


2019, Vol 25 (S2), pp. 298-299
Author(s): Markus Kühbach, Priyanshu Bajaj, Andrew Breen, Eric A. Jägle, Baptiste Gault

2022
Author(s): Jonathan Vincent, Jing Gong, Martin Karp, Adam Peplinski, Niclas Jansson, ...

Author(s): Blake G. Fitch, Aleksandr Rayshubskiy, Maria Eleftheriou, T. J. Christopher Ward, Mark Giampapa, ...

Author(s): M. Baity-Jesi, R. A. Baños, A. Cruz, L. A. Fernandez, J. M. Gil-Narvion, ...

2020, Vol 499 (2), pp. 1841-1853
Author(s): Natascha Manger, Hubert Klahr, Wilhelm Kley, Mario Flock

Abstract. Theoretical models of protoplanetary discs have shown the vertical shear instability (VSI) to be a prime candidate for explaining turbulence in the dead zone of the disc. However, simulations of the VSI have yet to show consistent levels of key disc turbulence parameters like the stress-to-pressure ratio α. We aim to reconcile these different values by performing a parameter study of the VSI with a focus on the disc density gradient p and aspect ratio h = H/R. We use full 2π 3D simulations of the disc for a chosen set of both parameters. All simulations are evolved for 1000 reference orbits at a resolution of 18 cells per scale height h. We find that the saturated stress-to-pressure ratio in our simulations depends on the disc aspect ratio, with a strong scaling of α ∝ h^2.6, in contrast to the traditional α model, where the viscosity scales as ν ∝ αh² with a constant α. We also observe the consistent formation of large-scale vortices across all investigated parameters. The vortices uniformly show aspect ratios of χ ≈ 10 and radial widths of approximately 1.5H. With our findings we can reconcile the different values reported for the stress-to-pressure ratio from both isothermal and full radiation hydrodynamics models, and we show long-term evolution effects of the VSI that could aid in the formation of planetesimals.
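
One consequence is worth spelling out (it follows from the relations quoted in the abstract, not from any additional claim of the paper): since the effective viscosity obeys ν ∝ αh², the measured α ∝ h^2.6 gives

    ν ∝ h^2.6 · h² = h^4.6,

a considerably steeper dependence of angular-momentum transport on the disc aspect ratio than the ν ∝ h² of the constant-α model.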


2020, Vol 173, pp. 109359
Author(s): Jens Glaser, Peter S. Schwendeman, Joshua A. Anderson, Sharon C. Glotzer

Author(s): Amanda Bienz, William D. Gropp, Luke N. Olson

Algebraic multigrid (AMG) is often viewed as a scalable O(n) solver for sparse linear systems. Yet, AMG lacks parallel scalability due to the increasingly large costs associated with communication, both in the initial construction of a multigrid hierarchy and in the iterative solve phase. This work introduces a parallel implementation of AMG that reduces the cost of communication, yielding improved parallel scalability. It is common in Message Passing Interface (MPI) programs, particularly in the MPI-everywhere approach, to arrange inter-process communication so that messages are transported in the same way regardless of the locations of the sending and receiving processes. Performance tests, however, show notable differences between the costs of intra-node and inter-node communication, motivating a restructuring of the communication: the communication schedule takes advantage of the less costly intra-node communication, reducing both the number and the size of inter-node messages. This node-centric communication is extended to a range of components in both the setup and solve phases of AMG, improving both the weak and strong scalability of the entire method.
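
The restructuring described in this abstract can be illustrated with standard MPI facilities: split the world communicator into per-node groups, funnel each node's data onto a single leader rank over cheap intra-node links, and let only the leaders exchange inter-node messages. The sketch below is a minimal C++/MPI illustration, with a placeholder all-reduce standing in for the paper's far more elaborate node-aware AMG communication schedule.

// Node-aware message aggregation sketch (illustrative; the paper's actual
// AMG communication schedule is considerably more involved).
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int world_rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Ranks that share memory (i.e. live on the same node) end up together.
  MPI_Comm node_comm;
  MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, world_rank,
                      MPI_INFO_NULL, &node_comm);
  int node_rank = 0, node_size = 0;
  MPI_Comm_rank(node_comm, &node_rank);
  MPI_Comm_size(node_comm, &node_size);

  // Step 1: cheap intra-node gather onto the node leader (node_rank == 0).
  double my_value = static_cast<double>(world_rank);
  std::vector<double> gathered(node_rank == 0 ? node_size : 0);
  MPI_Gather(&my_value, 1, MPI_DOUBLE, gathered.data(), 1, MPI_DOUBLE,
             0, node_comm);

  // Step 2: only leaders take part in the expensive inter-node traffic,
  // sending one aggregated message instead of node_size separate ones.
  MPI_Comm leader_comm;
  MPI_Comm_split(MPI_COMM_WORLD, node_rank == 0 ? 0 : MPI_UNDEFINED,
                 world_rank, &leader_comm);
  if (node_rank == 0) {
    int n_nodes = 0;
    MPI_Comm_size(leader_comm, &n_nodes);
    double node_sum = 0.0; // placeholder aggregate for this node's data
    for (double v : gathered) node_sum += v;
    double total = 0.0;
    MPI_Allreduce(&node_sum, &total, 1, MPI_DOUBLE, MPI_SUM, leader_comm);
    std::printf("node leader (world rank %d): %d nodes, global sum %.0f\n",
                world_rank, n_nodes, total);
    MPI_Comm_free(&leader_comm);
  }
  // Step 3 (omitted): leaders scatter the received data back within the node.
  MPI_Comm_free(&node_comm);
  MPI_Finalize();
  return 0;
}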


2016, Vol 9 (4), pp. 1413-1422
Author(s): Fabian Jakub, Bernhard Mayer

Abstract. The recently developed 3-D TenStream radiative transfer solver was integrated into the University of California, Los Angeles large-eddy simulation (UCLA-LES) cloud-resolving model. This work documents the overall performance of the TenStream solver as well as the technical challenges of migrating from 1-D schemes to 3-D schemes. In particular, the employed Monte Carlo spectral integration needed to be reexamined in conjunction with 3-D radiative transfer. Despite the fact that the spectral sampling has to be performed uniformly over the whole domain, we find that the Monte Carlo spectral integration remains valid. To understand the performance characteristics of the coupled TenStream solver, we conducted weak- as well as strong-scaling experiments. In this context, we investigate two matrix preconditioners: geometric algebraic multigrid preconditioning (GAMG) and block Jacobi incomplete LU (ILU) factorization, and find that algebraic multigrid preconditioning performs well for complex scenes and highly parallelized simulations. The TenStream solver is tested for up to 4096 cores and shows a parallel scaling efficiency of 80–90 % on various supercomputers. Compared to the widely employed 1-D delta-Eddington two-stream solver, the computational costs for the radiative transfer solver alone increase by a factor of 5–10.
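
GAMG and block Jacobi ILU are the names these preconditioners carry in PETSc, which suggests a PETSc-based solver stack; assuming that, the comparison can be reproduced on a generic sparse system purely through runtime options. Below is a minimal sketch using a 1-D Laplacian test matrix, not TenStream's actual transport system (requires a recent PETSc release for the PetscCall() macro).

// Minimal PETSc KSP solve with a runtime-selectable preconditioner.
// Illustrative only; the matrix is a 1-D Laplacian, not TenStream's system.
#include <petscksp.h>

int main(int argc, char **argv) {
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  const PetscInt n = 1000; // global size of the test matrix
  Mat A;
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  PetscInt lo, hi;
  PetscCall(MatGetOwnershipRange(A, &lo, &hi));
  for (PetscInt i = lo; i < hi; ++i) { // tridiagonal 1-D Laplacian stencil
    if (i > 0)     PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  Vec x, b;
  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0));

  KSP ksp;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); // honours -pc_type / -sub_pc_type flags
  PetscCall(KSPSolve(ksp, b, x));

  PetscInt its;
  PetscCall(KSPGetIterationNumber(ksp, &its));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "converged in %d iterations\n", (int)its));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}

Running the same binary with -pc_type gamg or with -pc_type bjacobi -sub_pc_type ilu switches between the two preconditioners named in the abstract without recompiling.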

