The International Journal of High Performance Computing Applications
Latest Publications


TOTAL DOCUMENTS: 944 (last five years: 142)
H-INDEX: 53 (last five years: 6)

Published by Sage Publications
ISSN: 1094-3420

Author(s): Zhi Yao, Revathi Jambunathan, Yadong Zeng, Andrew Nonaka

We present a high-performance coupled electrodynamics–micromagnetics solver for full physical modeling of signals in microelectronic circuitry. The overall strategy couples a finite-difference time-domain approach for Maxwell’s equations to a magnetization model described by the Landau–Lifshitz–Gilbert equation. The algorithm is implemented in the Exascale Computing Project software framework, AMReX, which provides effective scalability on manycore and GPU-based supercomputing architectures. Furthermore, the code leverages ongoing developments of the Exascale Application Code, WarpX, which is primarily being developed for plasma wakefield accelerator modeling. Our temporal coupling scheme provides second-order accuracy in space and time by combining the integration steps for the magnetic field and magnetization into an iterative sub-step that includes a trapezoidal temporal discretization for the magnetization. The performance of the algorithm is demonstrated by the excellent scaling results on NERSC multicore and GPU systems, with a significant (59×) speedup on the GPU in a node-by-node comparison. We demonstrate the utility of our code by performing simulations of an electromagnetic waveguide and a magnetically tunable filter.
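The coupling strategy described above can be written schematically as follows; the right-hand-side notation f(M, H), the iteration index k, and this standard form of the Landau–Lifshitz–Gilbert (LLG) equation are illustrative and not the paper's exact discretization.

```latex
% Schematic of an iterative trapezoidal sub-step coupling M and H within one
% FDTD time step; notation is illustrative, not the paper's exact scheme.
% Landau--Lifshitz--Gilbert right-hand side (gyromagnetic ratio \gamma,
% damping \alpha, saturation magnetization M_s):
\[
  \frac{\partial \mathbf{M}}{\partial t} = \mathbf{f}(\mathbf{M},\mathbf{H})
  = -\gamma\,\mathbf{M}\times\mathbf{H}_{\mathrm{eff}}
    + \frac{\alpha}{M_s}\,\mathbf{M}\times\frac{\partial \mathbf{M}}{\partial t}
\]
% Trapezoidal (second-order) update, iterated over k to self-consistency with
% the simultaneously updated magnetic field:
\[
  \mathbf{M}^{n+1,(k+1)} = \mathbf{M}^{n}
  + \frac{\Delta t}{2}\Bigl[\mathbf{f}\bigl(\mathbf{M}^{n},\mathbf{H}^{n}\bigr)
  + \mathbf{f}\bigl(\mathbf{M}^{n+1,(k)},\mathbf{H}^{n+1,(k)}\bigr)\Bigr],
  \qquad k = 0, 1, \dots
\]
```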


Author(s): Yuta Hirokawa, Atsushi Yamada, Shunsuke Yamada, Masashi Noda, Mitsuharu Uemoto, ...

In the field of optical science, it is becoming increasingly important to observe and manipulate matter at the atomic scale using ultrashort pulsed light. For the first time, we have performed an ab initio simulation that solves the Maxwell equations for the electromagnetic fields of light, the time-dependent Kohn-Sham equation for electrons, and the Newton equation for ions in extended systems. In the simulation, the most time-consuming parts were stencil and nonlocal pseudopotential operations on the electron orbitals, as well as fast Fourier transforms for the electron density. Code optimization was performed thoroughly on the Fujitsu A64FX processor to achieve the highest performance. A simulation of an amorphous SiO2 thin film composed of more than 10,000 atoms was performed using 27,648 nodes of the Fugaku supercomputer. The simulation achieved excellent time-to-solution, with performance close to the maximum possible value given the memory-bandwidth bound, as well as excellent weak scalability.
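As a concrete illustration of the stencil workload mentioned above, the sketch below applies a 7-point finite-difference Laplacian to one complex-valued orbital on a 3D grid; the array layout, boundary handling, and coefficients are illustrative assumptions, not the authors' code.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Illustrative 7-point Laplacian stencil applied to one electron orbital on an
// nx*ny*nz grid with spacing h (boundary layer skipped for brevity).
// Schematic of the kind of kernel described in the abstract, not the real code.
void apply_laplacian(const std::vector<std::complex<double>>& psi,
                     std::vector<std::complex<double>>& out,
                     std::size_t nx, std::size_t ny, std::size_t nz, double h) {
  const double inv_h2 = 1.0 / (h * h);
  auto idx = [=](std::size_t i, std::size_t j, std::size_t k) {
    return (k * ny + j) * nx + i;                 // contiguous in i
  };
  for (std::size_t k = 1; k + 1 < nz; ++k)
    for (std::size_t j = 1; j + 1 < ny; ++j)
      for (std::size_t i = 1; i + 1 < nx; ++i)
        out[idx(i, j, k)] =
            (psi[idx(i - 1, j, k)] + psi[idx(i + 1, j, k)] +
             psi[idx(i, j - 1, k)] + psi[idx(i, j + 1, k)] +
             psi[idx(i, j, k - 1)] + psi[idx(i, j, k + 1)] -
             6.0 * psi[idx(i, j, k)]) * inv_h2;
}
```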


Author(s): John A Turner, James Belak, Nathan Barton, Matthew Bement, Neil Carlson, ...

Additive manufacturing (AM), or 3D printing, of metals is transforming the fabrication of components, in part by dramatically expanding the design space, allowing optimization of shape and topology. However, although the physical processes involved in AM are similar to those of welding, a field with decades of experimental, modeling, simulation, and characterization experience, qualification of AM parts remains a challenge. The availability of exascale computational systems, particularly when combined with data-driven approaches such as machine learning, enables topology and shape optimization as well as accelerated qualification by providing process-aware, locally accurate microstructure and mechanical property models. We describe the physics components that constitute the Exascale Additive Manufacturing (ExaAM) simulation environment and report progress in using highly resolved melt pool simulations to inform part-scale finite element thermomechanics simulations, drive microstructure evolution, and determine constitutive mechanical property relationships based on those microstructures using polycrystal plasticity. We report on the implementation of these components for exascale computing architectures, as well as the multi-stage simulation workflow that provides a unique high-fidelity model of process–structure–property relationships for AM parts. In addition, we discuss verification and validation through collaboration with efforts such as AM-Bench, a set of benchmark test problems under development by a team led by the National Institute of Standards and Technology.
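The multi-stage workflow described above can be pictured as a data pipeline in which each stage consumes the previous stage's fields; every type and function name below is a hypothetical placeholder, not an actual ExaAM interface.

```cpp
// Hypothetical sketch of the staged process-structure-property workflow; all
// names are illustrative placeholders, not ExaAM interfaces.
struct MeltPoolHistory   {};  // thermal histories from resolved melt-pool runs
struct ThermalField      {};  // part-scale temperature/stress fields
struct Microstructure    {};  // grain structure evolved from the thermal history
struct ConstitutiveModel {};  // homogenized mechanical response for part scale

MeltPoolHistory   simulate_melt_pool()                               { return {}; }
ThermalField      part_scale_thermomechanics(const MeltPoolHistory&) { return {}; }
Microstructure    evolve_microstructure(const ThermalField&)         { return {}; }
ConstitutiveModel polycrystal_plasticity(const Microstructure&)      { return {}; }

int main() {
  // Each stage informs the next, so the property model reflects the simulated
  // process history rather than a generic material database entry.
  ConstitutiveModel props = polycrystal_plasticity(
      evolve_microstructure(part_scale_thermomechanics(simulate_melt_pool())));
  (void)props;
  return 0;
}
```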


Author(s): Pascal R Bähr, Bruno Lang, Peer Ueberholz, Marton Ady, Roberto Kersevan

Molflow+ is a Monte Carlo (MC) simulation software for ultra-high vacuum, mainly used to simulate pressure in particle accelerators. In this article, we present and discuss the design choices arising in a new implementation of its ray-tracing–based simulation unit for Nvidia RTX Graphics Processing Units (GPUs). The GPU simulation kernel was designed with Nvidia’s OptiX 7 API to make use of modern hardware-accelerated ray-tracing units, found in recent RTX-series GPUs based on the Turing and Ampere architectures. Even with the challenges posed by switching to 32-bit computations, our kernel runs much faster than on comparable CPUs, at the expense of a marginal drop in calculation precision.
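A minimal CPU-side sketch of one test-particle Monte Carlo ingredient used in free-molecular-flow codes of this kind: particles leaving a surface are re-emitted with a cosine (Lambertian) distribution about the local normal. This is a schematic of the method only, with a toy tally; it is not Molflow+ code and not its OptiX GPU kernel.

```cpp
#include <cmath>
#include <random>

// Cosine-law (Lambertian) re-emission about the surface normal (0, 0, 1),
// as used in free-molecular-flow Monte Carlo simulations. Illustrative only.
struct Vec3 { double x, y, z; };

constexpr double kPi = 3.14159265358979323846;

Vec3 sample_cosine_direction(std::mt19937& rng) {
  std::uniform_real_distribution<double> uni(0.0, 1.0);
  const double u = uni(rng);                  // sets the polar angle
  const double phi = 2.0 * kPi * uni(rng);    // azimuth, uniform in [0, 2*pi)
  const double sin_theta = std::sqrt(u);      // pdf proportional to cos(theta)
  const double cos_theta = std::sqrt(1.0 - u);
  return {sin_theta * std::cos(phi), sin_theta * std::sin(phi), cos_theta};
}

int main() {
  std::mt19937 rng(42);
  const long n_particles = 100000;
  long hits = 0;
  for (long p = 0; p < n_particles; ++p) {
    // Toy tally: fraction emitted within 30 degrees of the normal
    // (analytically sin^2(30 deg) = 0.25 under cosine emission).
    if (sample_cosine_direction(rng).z > std::cos(kPi / 6.0)) ++hits;
  }
  return hits > 0 ? 0 : 1;
}
```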


Author(s): Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, ...

This work is based on the seminar titled ‘Resiliency in Numerical Algorithm Design for Extreme Scale Simulations’, held March 1–6, 2020, at Schloss Dagstuhl and attended by all the authors. Advanced supercomputing is characterized by very high computation speeds, achieved at the cost of an enormous amount of resources and correspondingly high costs. A typical large-scale computation running for 48 h on a system consuming 20 MW, as predicted for exascale systems, would consume a million kWh, corresponding to about 100k Euro in energy cost for executing 10²³ floating-point operations. It is clearly unacceptable to lose the whole computation if any of the several million parallel processes fails during the execution. Moreover, if a single operation suffers from a bit-flip error, should the whole computation be declared invalid? What about the notion of reproducibility itself: should this core paradigm of science be revised and refined for results that are obtained by large-scale simulation?

Naive versions of conventional resilience techniques will not scale to the exascale regime: with a main memory footprint of tens of petabytes, synchronously writing checkpoint data all the way to background storage at frequent intervals will create intolerable overheads in runtime and energy consumption. Forecasts show that the mean time between failures could be lower than the time to recover from such a checkpoint, so that large calculations at scale might not make any progress if robust alternatives are not investigated. More advanced resilience techniques must be devised. The key may lie in exploiting both advanced system features and specific application knowledge.

Research will face two essential questions: (1) what are the reliability requirements for a particular computation, and (2) how do we best design the algorithms and software to meet these requirements? While the analysis of use cases can help understand the particular reliability requirements, the construction of remedies is currently wide open. One avenue would be to refine and improve on system- or application-level checkpointing and rollback strategies in case an error is detected. Developers might use fault notification interfaces and flexible runtime systems to respond to node failures in an application-dependent fashion. Novel numerical algorithms or more stochastic computational approaches may be required to meet accuracy requirements in the face of undetectable soft errors. These ideas constituted an essential topic of the seminar.

The goal of this Dagstuhl Seminar was to bring together a diverse group of scientists with expertise in exascale computing to discuss novel ways to make applications resilient against detected and undetected faults. In particular, participants explored the role that algorithms and applications play in the holistic approach needed to tackle this challenge. This article gathers a broad range of perspectives on the role of algorithms, applications, and systems in achieving resilience for extreme scale simulations. The ultimate goal is to spark novel ideas and encourage the development of concrete solutions for achieving such resilience holistically.
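The headline figures quoted in the first paragraph follow from a rough calculation; the unit energy price of about 0.10 Euro per kWh and the sustained rate of roughly 10^18 flop/s are assumptions made here to reproduce them.

```latex
% Back-of-the-envelope check of the energy, cost, and operation-count figures
% quoted above; the 0.10 EUR/kWh price and the sustained 10^{18} flop/s rate
% are illustrative assumptions.
\[
  20\,\mathrm{MW} \times 48\,\mathrm{h} = 960\,\mathrm{MWh} \approx 10^{6}\,\mathrm{kWh},
  \qquad
  10^{6}\,\mathrm{kWh} \times 0.10\,\mathrm{EUR/kWh} = 10^{5}\,\mathrm{EUR},
\]
\[
  10^{18}\,\mathrm{flop/s} \times 48 \times 3600\,\mathrm{s}
  \approx 1.7 \times 10^{23}\,\mathrm{flop}.
\]
```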


Author(s): Timothy C Germann

We provide an overview of the six co-design centers within the U.S. Department of Energy’s Exascale Computing Project, each of which is described in more detail in a separate paper in this special issue. We also give a perspective on the evolution of computational co-design.


Author(s): Isaac Lyngaas, Matt Norman, Youngsung Kim

In this work, we demonstrate the process of porting the cloud resolving model (CRM) used in the Energy Exascale Earth System Model Multi-Scale Modeling Framework (E3SM-MMF) from its original Fortran code base to C++ code using a portability library. This porting is performed using the Yet Another Kernel Launcher (YAKL), a simplified C++ portability library that specializes in Fortran porting. In particular, we detail our step-by-step approach for porting the System for Atmospheric Modeling (SAM), the CRM used in E3SM-MMF, using a hybrid Fortran/C++ framework that allows for systematic reproduction and correctness testing of gradually ported YAKL C++ code. Additionally, we analyze the performance of the ported code on OLCF’s Summit supercomputer.
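To illustrate the kernel-porting pattern described above with a self-contained example, the sketch below replaces a Fortran loop with a lambda handed to a hand-rolled parallel_for; the hand-rolled launcher stands in for the actual YAKL API (which is not reproduced here), and the Fortran original shown in the comment is likewise illustrative.

```cpp
#include <cstddef>
#include <vector>

// Schematic of the loop-to-lambda pattern used when porting Fortran kernels to
// a C++ portability layer. This serial parallel_for is a stand-in; a GPU
// backend would launch the same lambda on the device.
template <class F>
void parallel_for(std::size_t n, F&& body) {
  for (std::size_t i = 0; i < n; ++i) body(i);  // serial stand-in backend
}

// Illustrative Fortran original:   do i = 1, n
//                                    q(i) = q(i) + dt * tend(i)
//                                  end do
void advance_state(std::vector<double>& q, const std::vector<double>& tend,
                   double dt) {
  parallel_for(q.size(), [&](std::size_t i) { q[i] += dt * tend[i]; });
}

int main() {
  std::vector<double> q(8, 1.0), tend(8, 0.5);
  advance_state(q, tend, 0.1);  // every q(i) becomes 1.05
  return 0;
}
```

Keeping the loop body intact inside the lambda is what makes side-by-side correctness testing against the unported Fortran straightforward.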


Author(s): Mojtaba Barzegari, Liesbet Geris

A combination of reaction–diffusion models with moving-boundary problems yields a system in which the diffusion (spreading and penetration) and reaction (transformation) evolve the system’s state and geometry over time. These systems can be used in a wide range of engineering applications. In this study, as an example of such a system, the degradation of metallic materials is investigated. A mathematical model is constructed of the diffusion-reaction processes and the movement of the corrosion front of a magnesium block floating in a chemical solution. The corresponding parallelized computational model is implemented using the finite element method, and the weak- and strong-scaling behaviors of the model are evaluated to analyze the performance and efficiency of the employed high-performance computing techniques.
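As a minimal illustration of the diffusion-reaction update at the heart of such models, the toy sketch below advances a 1D concentration field by one explicit Euler step with a first-order loss term; the paper's model is a parallel finite element formulation with a moving corrosion front, which this sketch does not attempt to reproduce.

```cpp
#include <cstddef>
#include <vector>

// Toy explicit Euler step for du/dt = D * d2u/dx2 - k * u on a 1D grid.
// Illustrative only; stability requires dt <= dx*dx / (2*D).
void reaction_diffusion_step(std::vector<double>& u, double D, double k,
                             double dx, double dt) {
  std::vector<double> u_new(u);
  for (std::size_t i = 1; i + 1 < u.size(); ++i) {
    const double diffusion = D * (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (dx * dx);
    const double reaction  = -k * u[i];  // first-order transformation/loss
    u_new[i] = u[i] + dt * (diffusion + reaction);
  }
  u = u_new;  // the two boundary nodes keep their (Dirichlet) values
}
```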


Author(s): Seher Acer, Ariful Azad, Erik G Boman, Aydın Buluç, Karen D Devine, ...

Combinatorial algorithms in general and graph algorithms in particular play a critical enabling role in numerous scientific applications. However, the irregular memory access nature of these algorithms makes them among the hardest algorithmic kernels to implement on parallel systems. With tens of billions of hardware threads and deep memory hierarchies, exascale computing systems in particular pose extreme challenges in scaling graph algorithms. The codesign center on combinatorial algorithms, ExaGraph, was established to design and develop methods and techniques for efficient implementation of key combinatorial (graph) algorithms chosen from a diverse set of exascale applications. Algebraic and combinatorial methods have a complementary role in the advancement of computational science and engineering, including enabling each other. In this paper, we survey the algorithmic and software development activities performed under the auspices of ExaGraph from both a combinatorial and an algebraic perspective. In particular, we detail our recent efforts in porting the algorithms to manycore accelerator (GPU) architectures. We also provide a brief survey of the applications that have benefited from the scalable implementations of different combinatorial algorithms to enable scientific discovery at scale. We believe that several applications will benefit from the algorithmic and software tools developed by the ExaGraph team.
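The irregular memory access pattern mentioned above is easy to see in a compressed sparse row (CSR) graph traversal, sketched below as a generic level-synchronous breadth-first search; this is a textbook illustration, not ExaGraph code.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Generic level-synchronous BFS over a CSR graph (row_ptr/col_idx). The
// neighbor gather over col_idx[row_ptr[v] .. row_ptr[v+1]) is the kind of
// data-dependent, irregular memory access that makes graph kernels hard to
// scale on deep memory hierarchies. Illustrative only.
std::vector<int> bfs_levels(const std::vector<std::size_t>& row_ptr,
                            const std::vector<int>& col_idx, int source) {
  std::vector<int> level(row_ptr.size() - 1, -1);
  std::queue<int> frontier;
  level[source] = 0;
  frontier.push(source);
  while (!frontier.empty()) {
    const int v = frontier.front();
    frontier.pop();
    for (std::size_t e = row_ptr[v]; e < row_ptr[v + 1]; ++e) {
      const int u = col_idx[e];   // irregular, data-dependent access
      if (level[u] == -1) {       // first visit: assign the next level
        level[u] = level[v] + 1;
        frontier.push(u);
      }
    }
  }
  return level;
}
```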


Author(s): Francis J Alexander, James Ang, Jenna A Bilbrey, Jan Balewski, Tiernan Casey, ...

Rapid growth in data, computational methods, and computing power is driving a remarkable revolution in what is variously termed machine learning (ML), statistical learning, computational learning, and artificial intelligence. In addition to highly visible successes in machine-based natural language translation, playing the game Go, and self-driving cars, these new technologies also have profound implications for computational and experimental science and engineering, as well as for the exascale computing systems that the Department of Energy (DOE) is developing to support those disciplines. Not only do these learning technologies open up exciting opportunities for scientific discovery on exascale systems, but they also appear poised to have important implications for the design and use of exascale computers themselves, including high-performance computing (HPC) for ML and ML for HPC. The overarching goal of the ExaLearn co-design project is to provide exascale ML software for use by Exascale Computing Project (ECP) applications, other ECP co-design centers, and DOE experimental facilities and leadership-class computing facilities.

