MagIC v5.10: a two-dimensional message-passing interface (MPI) distribution for pseudo-spectral magnetohydrodynamics simulations in spherical geometry

2021 ◽  
Vol 14 (12) ◽  
pp. 7477-7495
Author(s):  
Rafael Lago ◽  
Thomas Gastine ◽  
Tilman Dannert ◽  
Markus Rampp ◽  
Johannes Wicht

Abstract. We discuss two parallelization schemes for MagIC, an open-source, high-performance, pseudo-spectral code for the numerical solution of the magnetohydrodynamics equations in a rotating spherical shell. MagIC calculates the non-linear terms on a numerical grid in spherical coordinates, while the time step updates are performed on radial grid points with a spherical harmonic representation of the lateral directions. Several transforms are required to switch between the different representations. The established hybrid parallelization of MagIC uses message-passing interface (MPI) distribution in radius and relies on existing fast spherical transforms using OpenMP. Our new two-dimensional MPI decomposition implementation also distributes the latitudes or the azimuthal wavenumbers across the available MPI tasks and compute cores. We discuss several non-trivial algorithmic optimizations and the different data distribution layouts employed by our scheme. In particular, the two-dimensional distribution data layout yields a code that strongly scales well beyond the limit of the current one-dimensional distribution. We also show that the two-dimensional distribution implementation, although not yet fully optimized, can already be faster than the existing finely optimized hybrid parallelization when using many thousands of CPU cores. Our analysis indicates that the two-dimensional distribution variant can be further optimized to also surpass the performance of the one-dimensional distribution for a few thousand cores.
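As a rough sketch of the kind of layout the abstract describes, the following Python/mpi4py example sets up a two-dimensional process grid over radius and latitude with per-direction sub-communicators; the grid sizes, the block distribution, and the helper function are illustrative assumptions, not MagIC's actual (Fortran) implementation.

```python
# Illustrative sketch (not from the MagIC source): a 2D MPI decomposition over
# radius and latitude/azimuthal wavenumber set up with mpi4py.
from mpi4py import MPI

comm = MPI.COMM_WORLD
n_r, n_theta = 97, 256            # hypothetical radial and latitudinal grid sizes

# Factor the available tasks into a (radius x latitude) process grid.
dims = MPI.Compute_dims(comm.Get_size(), 2)
cart = comm.Create_cart(dims, periods=[False, False], reorder=True)
coord_r, coord_th = cart.Get_coords(cart.Get_rank())

def local_range(n, nproc, coord):
    """Contiguous block of indices owned by this task along one dimension."""
    base, rem = divmod(n, nproc)
    lo = coord * base + min(coord, rem)
    return lo, lo + base + (1 if coord < rem else 0)

r_lo, r_hi = local_range(n_r, dims[0], coord_r)
th_lo, th_hi = local_range(n_theta, dims[1], coord_th)

# Sub-communicators along each direction, e.g. for the all-to-all transposes
# needed when switching between grid space and spherical harmonic space.
comm_r = cart.Sub([True, False])
comm_th = cart.Sub([False, True])
```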


Author(s):  
Ganesh Hegde ◽  
Madhu Gattumane

An improvement in accuracy, without sacrificing the stability and convergence of the solution to unsteady diffusion heat transfer problems, has been achieved and demonstrated with the computational method of the enhanced explicit scheme (EES), applied to transient one-dimensional and two-dimensional heat conduction. The truncation error induced in the explicit scheme using the finite difference technique is eliminated by optimization of the partial derivatives in the Taylor series expansion, through application of the interface theory developed by the authors. This theory, in simple terms, gives the optimum values of the decision vectors in a redundant linear equation. The time derivatives and the spatial partial derivatives in the transient heat conduction take values depending on the time step chosen and the grid size assumed. The time correction factor and the space correction factor defined by the step sizes govern the accuracy, stability, and convergence of the EES. A comparison of the EES results with analytical results shows a decreased error relative to the results of the explicit scheme. The objective of the paper is to reduce the error of the explicit scheme by eliminating the truncation error introduced by neglecting the higher-order terms in the expansion of the governing function. As pilot examples, the implementation is aimed at solving one-dimensional and two-dimensional problems of transient heat conduction, and the results are compared with those cited in the referenced literature.
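For context, the baseline explicit (FTCS) update for one-dimensional transient conduction, whose neglected Taylor-series terms the EES aims to eliminate, is the standard formula below (shown as a reference, not as the authors' corrected scheme).

```latex
% Baseline explicit (FTCS) update for 1D transient conduction; the EES of the
% paper modifies this by accounting for the truncated Taylor-series terms.
T_i^{n+1} = T_i^{n} + \frac{\alpha\,\Delta t}{\Delta x^{2}}
            \left( T_{i+1}^{n} - 2\,T_i^{n} + T_{i-1}^{n} \right),
\qquad
\frac{\alpha\,\Delta t}{\Delta x^{2}} \le \tfrac{1}{2}
\quad \text{(stability limit of the plain explicit scheme).}
```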


Geophysics ◽  
2013 ◽  
Vol 78 (2) ◽  
pp. F7-F15 ◽  
Author(s):  
Robin M. Weiss ◽  
Jeffrey Shragge

Efficiently modeling seismic data sets in complex 3D anisotropic media by solving the 3D elastic wave equation is an important challenge in computational geophysics. Using a stress-stiffness formulation on a regular grid, we tested a 3D finite-difference time-domain solver using a second-order temporal and eighth-order spatial accuracy stencil that leverages the massively parallel architecture of graphics processing units (GPUs) to accelerate the computation of key kernels. The relatively small memory of an individual GPU limits the model domain sizes that can be computed on a single device. To circumvent this constraint and move toward modeling industry-sized 3D anisotropic elastic data sets, we parallelized computation across multiple GPU devices by using domain decomposition and, for each time step, employing an interdevice communication protocol to exchange data values falling within interior boundaries of each subdomain. For two or more GPU devices within a single compute node, we used direct peer-to-peer (i.e., GPU-to-GPU) communication, whereas for networked nodes we employed message-passing interface directives to route data over the network. Our 2D GPU-based anisotropic elastic modeling tests achieved a [Formula: see text] speedup relative to an OpenMP CPU implementation run on an eight-core machine, whereas our 3D tests using dual-GPU devices produced up to a [Formula: see text] speedup. The performance boost afforded by the GPU architecture allowed us to model seismic data for 3D anisotropic elastic models at lower hardware cost and in less time than has been previously possible.
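The per-time-step exchange of interior-boundary values that the abstract mentions follows a standard halo-exchange pattern. The sketch below illustrates it with mpi4py along one decomposition axis; the array shapes, the halo width of four (matching an eighth-order stencil), and the host-side buffers are assumptions for illustration rather than the authors' GPU implementation.

```python
# Illustrative halo exchange between neighbouring subdomains along one axis;
# on a GPU cluster the same pattern applies with device buffers (peer-to-peer
# within a node, MPI across nodes).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
halo = 4                                  # eighth-order stencil needs 4 ghost cells
nz_local = 64                             # hypothetical local slab thickness
field = np.zeros((nz_local + 2 * halo, 128, 128), dtype=np.float32)

lower = rank - 1 if rank > 0 else MPI.PROC_NULL
upper = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Send top interior layers up, receive into bottom ghost layers.
send_up = np.ascontiguousarray(field[-2 * halo:-halo])
recv_lo = np.empty_like(send_up)
comm.Sendrecv(send_up, dest=upper, sendtag=0,
              recvbuf=recv_lo, source=lower, recvtag=0)
if lower != MPI.PROC_NULL:
    field[:halo] = recv_lo

# Send bottom interior layers down, receive into top ghost layers.
send_dn = np.ascontiguousarray(field[halo:2 * halo])
recv_hi = np.empty_like(send_dn)
comm.Sendrecv(send_dn, dest=lower, sendtag=1,
              recvbuf=recv_hi, source=upper, recvtag=1)
if upper != MPI.PROC_NULL:
    field[-halo:] = recv_hi
```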


Author(s):  
Yu Liu ◽  
Michael Nishimura ◽  
Marat Seydaliev ◽  
Markus Piro

Recent trends in nuclear reactor performance and safety analyses increasingly rely on multiscale multiphysics computer simulations to enhance predictive capabilities by replacing conventional methods that are largely empirically based with a more scientifically based methodology. Through this approach, one addresses the issue of traditionally employing a suite of stand-alone codes that independently simulate various physical phenomena that were previously disconnected. Multiple computer simulations of different phenomena must exchange data during runtime to address these interdependencies. Previously, recommendations have been made regarding various approaches for piloting different design options of data coupling for multiphysics systems (Seydaliev and Caswell, 2014, "CORBA and MPI Based "Backbone" for Coupling Advanced Simulation Tools," AECL Nucl. Rev., 3(2), pp. 83–90). This paper describes progress since the initial pilot study that outlined the implementation and execution of a new distribution framework, referred to as "Backbone," to provide the necessary runtime exchange of data between different codes. The Backbone, currently under development at the Canadian Nuclear Laboratories (CNL), is a hybrid design using both common object request broker architecture (CORBA) and message passing interface (MPI) systems. This paper also presents two preliminary cases for coupling existing nuclear performance and safety analysis codes used for simulating fuel behavior, fission product release, thermal hydraulics, and neutron transport through the Backbone. Additionally, a pilot study presents a few strategies for a new time step controller (TSC) to synchronize the codes coupled through the Backbone. A performance and fidelity comparison is presented between a simple heuristic method for determining time step length and a more advanced third-order method, which was selected to maximize configurability and effectiveness of temporal integration, saving time steps and reducing wasted computation. The net effect of the foregoing features of the Backbone is to provide a practical toolset to couple existing and newly developed codes, which may be written in different programming languages and used on different operating systems, with minimal programming effort to enhance predictions of nuclear reactor performance and safety.
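To make the comparison concrete, the sketch below contrasts a simple heuristic step-size rule with a generic error-based controller of the kind a higher-order method enables; the function names, tolerances, and clamping factors are hypothetical and do not reproduce the TSC developed at CNL.

```python
# Hedged sketch of two generic time-step selection strategies of the kind the
# abstract compares; all constants are illustrative assumptions.

def heuristic_step(dt, converged, grow=1.2, shrink=0.5, dt_min=1e-6, dt_max=1.0):
    """Simple heuristic: grow the step after a successful exchange, shrink on failure."""
    dt = dt * grow if converged else dt * shrink
    return min(max(dt, dt_min), dt_max)

def error_based_step(dt, err_norm, tol=1e-4, order=3, safety=0.9,
                     dt_min=1e-6, dt_max=1.0):
    """Error-controlled step: scale dt from the estimated local error of an
    order-p method so the next step is predicted to meet the tolerance."""
    if err_norm == 0.0:
        factor = 2.0
    else:
        factor = safety * (tol / err_norm) ** (1.0 / (order + 1))
    # Clamp the growth/shrink factor and the absolute step size.
    return min(max(dt * min(max(factor, 0.2), 2.0), dt_min), dt_max)
```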


Mathematics ◽  
2021 ◽  
Vol 9 (24) ◽  
pp. 3267
Author(s):  
Alexander Sukhinov ◽  
Valentina Sidoryakina

The initial boundary value problem for the 3D convection-diffusion equation corresponding to the mathematical model of suspended matter transport in coastal marine systems and extended shallow water bodies is considered. The convective and diffusive transport operators in the horizontal and vertical directions have significantly different physical and spectral properties for this type of problem. For this reason, a two-dimensional–one-dimensional splitting scheme has been constructed, a three-dimensional analogue of the Peaceman–Rachford alternating direction scheme, which is suitable for the operational prediction of suspension spread in coastal systems. The paper proves a theorem on the stability of the solution with respect to the initial data and the right-hand-side functions, for the case of time-independent operators, in special energy norms determined by one of the splitting scheme operators. The accuracy has been investigated: as for the Peaceman–Rachford scheme with a special definition of the boundary conditions at the fractional time step, it is of second order in the time and spatial steps. This approach makes it possible to obtain parallel algorithms for solving the grid convection-diffusion equations that are economical in terms of the total solution time on multiprocessor systems, including both the time needed for arithmetic operations and the time required for information exchange between processors.
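For reference, the classical two-dimensional Peaceman–Rachford scheme, of which the paper constructs a (2D + 1D) three-dimensional analogue, advances the solution in two half-steps; in the standard notation below, Λ₁ and Λ₂ denote the split transport operators and τ the time step.

```latex
% Classical Peaceman--Rachford alternating-direction scheme (two-dimensional
% reference case), shown for context.
\frac{u^{\,n+1/2} - u^{\,n}}{\tau/2} = \Lambda_1 u^{\,n+1/2} + \Lambda_2 u^{\,n},
\qquad
\frac{u^{\,n+1} - u^{\,n+1/2}}{\tau/2} = \Lambda_1 u^{\,n+1/2} + \Lambda_2 u^{\,n+1}.
```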


2014 ◽  
Vol 51 (01) ◽  
pp. 162-173
Author(s):  
Ora E. Percus ◽  
Jerome K. Percus

We consider a one-dimensional discrete symmetric random walk with a reflecting boundary at the origin. Generating functions are found for the two-dimensional probability distribution P{S_n = x, max_{1≤j≤n} S_j = a} of being at position x after n steps, while the maximal location that the walker has achieved during these n steps is a. We also obtain the familiar (marginal) one-dimensional distribution for S_n = x, but more importantly that for max_{1≤j≤n} S_j = a, asymptotically at fixed a²/n. We are able to compute and compare the expectations and variances of the two one-dimensional distributions, finding that they have qualitatively similar forms, but differ quantitatively in the anticipated fashion.
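As an illustrative numerical check (not part of the paper), the joint distribution P{S_n = x, max_{1≤j≤n} S_j = a} can be estimated by direct simulation; the reflection convention at the origin (a forced step upward from 0) and the chosen parameters are assumptions for the example.

```python
# Monte Carlo estimate of the joint probability for a symmetric +/-1 walk
# with a reflecting boundary at the origin.
import numpy as np

def joint_probability(n, x, a, trials=200_000, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        s, m = 0, 0
        for _ in range(n):
            step = 1 if s == 0 else rng.choice((-1, 1))  # reflect at the origin
            s += step
            m = max(m, s)
        hits += (s == x and m == a)
    return hits / trials

print(joint_probability(n=10, x=2, a=4))
```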


2011 ◽  
Vol 64 (5) ◽  
pp. 1016-1024 ◽  
Author(s):  
J. Leandro ◽  
S. Djordjević ◽  
A. S. Chen ◽  
D. A. Savić ◽  
M. Stanić

Recently increased flood events have prompted researchers to improve existing coupled flood models such as one-dimensional (1D)/1D and 1D/two-dimensional (2D) models. While 1D/1D models simulate the sewer and surface networks using a one-dimensional approach, 1D/2D models represent the surface network by a two-dimensional surface grid. However, their application raises two issues for urban flood modellers: (1) stormwater system planning and emergency or risk analysis demand fast models, and the 1D/2D computational time is prohibitive; and (2) the recognized lack of field data (e.g. Hunter et al. (2008)) causes difficulties for the calibration/validation of 1D/1D models. In this paper we propose to overcome these issues by calibrating a 1D/1D model with the results of a 1D/2D model. The flood-inundation results show that: (1) 1D/2D results can be used to calibrate faster 1D/1D models; (2) the 1D/1D model is able to map the 1D/2D maximum flood extent well, and the flooding limits satisfactorily at each time step; (3) the major differences of the 1D/1D model are the instantaneous flow propagation and the overestimation of flood depths within surface ponds; (4) agreement in the volume surcharged by both models is a necessary condition for the validation of the 1D surface network; and (5) the agreement of the manhole discharge shapes measures the fitness of the calibrated 1D surface network.


2020 ◽  
Author(s):  
Jason Louis Turner ◽  
Samuel N. Stechmann

Abstract. Parallel computing can offer substantial speedup of numerical simulations in comparison to serial computing, as parallel computing uses many processors simultaneously rather than a single processor. However, it typically also requires substantial time and effort to convert a serial code into a parallel code. Here, a new module is developed to reduce the time and effort required to parallelize a serial code. The tested version of the module is written in the Fortran programming language, while the framework could also be extended to other languages (C++, Python, Julia, etc.). The Message Passing Interface is used to allow for either shared-memory or distributed-memory computer architectures. The software is designed for solving partial differential equations on a rectangular two-dimensional or three-dimensional domain, using finite difference, finite volume, pseudo-spectral, or other similar numerical methods. Examples are provided for two idealized models of atmospheric and oceanic fluid dynamics: the two-level quasi-geostrophic equations, and the stochastic heat equation as a model for turbulent advection–diffusion of either water vapor and clouds or sea surface height variability. In tests of the parallelized code, the strong scaling efficiency for the finite difference code is seen to be roughly 80 % to 90 %, which is achieved by adding only roughly 10 new lines to the serial code. Therefore, EZ Parallel provides great benefits with minimal additional effort.
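For reference, the strong scaling efficiency quoted above is conventionally defined as follows (a standard definition, not a formula taken from the paper), with T(1) the serial runtime and T(p) the runtime on p processes for the same fixed problem size.

```latex
% Strong-scaling efficiency: ideal scaling gives E(p) = 1 (100 %).
E(p) = \frac{T(1)}{p \, T(p)} .
```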


2012 ◽  
Vol 22 (7) ◽  
pp. 14-24 ◽  
Author(s):  
José Miguel Vargas-Félix ◽  
Salvador Botello-Rionda

The Finite Element Method (FEM) is used to solve problems such as solid deformation and heat diffusion in domains with complex geometries. This kind of geometry requires discretization with millions of elements, which is equivalent to solving systems of equations with sparse matrices and tens or hundreds of millions of variables. The aim is to use computer clusters to solve these systems. The solution method used is Schur substructuration. With it, a large system of equations can be divided into many small ones and solved more efficiently. This method allows parallelization. MPI (Message Passing Interface) is used to distribute the systems of equations, so that each one is solved on a computer of the cluster. Each system of equations is solved using a solver implemented with OpenMP as the local parallelization method.
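The Schur substructuring idea can be summarized as follows (a standard formulation assumed here for illustration): unknowns are split into subdomain-interior (I) and interface (B) blocks, so each subdomain's interior solve is independent and can be assigned to one machine of the cluster, while only the reduced interface system couples them.

```latex
% Block system and its Schur-complement reduction onto the interface unknowns.
\begin{pmatrix} A_{II} & A_{IB} \\ A_{BI} & A_{BB} \end{pmatrix}
\begin{pmatrix} u_I \\ u_B \end{pmatrix}
=
\begin{pmatrix} f_I \\ f_B \end{pmatrix},
\qquad
\underbrace{\left( A_{BB} - A_{BI} A_{II}^{-1} A_{IB} \right)}_{S}\, u_B
  = f_B - A_{BI} A_{II}^{-1} f_I .
```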


Author(s):  
Alessandro A. de Lima ◽  
Julio R. Meneghini ◽  
Marcio Mourelle ◽  
Enrique Casaprima ◽  
Ricardo B. Flatschart

In this paper the dynamic response and fatigue analysis of a marine SCR (Steel Catenary Riser) due to vortex shedding is numerically investigated. The riser is divided into two-dimensional sections along its length. The discrete vortex method (DVM) is employed for the assessment of the hydrodynamic forces acting on these two-dimensional sections. The hydrodynamic sections are solved independently, and the coupling among the sections is taken into account by the solution of the structure in the time domain by the finite element method implemented in the ANFLEX code [1]. Parallel processing is employed to improve the performance of the method. A master-slave approach via MPI (Message Passing Interface) is used to exploit the parallelism of the present code. The riser sections are divided equally among the nodes of the cluster. Each node solves the hydrodynamic sections assigned to it. The forces acting on the sections are then passed to the master processor, which is responsible for calculating the displacement of the whole structure. The time histories of stress are used to evaluate the damage as well as the life expectancy of the structure, with the rainflow method employed to count the cycles in the dynamic response.
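The master-slave pattern described above can be sketched as follows with mpi4py; the section count, the round-robin assignment, and the placeholder force and structural routines are illustrative assumptions, not the ANFLEX coupling itself.

```python
# Hedged sketch: workers compute hydrodynamic forces on their riser sections,
# the master assembles them, solves the structural step, and broadcasts the
# displacements back to all workers.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
n_sections = 120                              # hypothetical number of 2D sections
my_sections = range(rank, n_sections, size)   # round-robin assignment

def section_force(i, displacements):          # placeholder for the DVM solve
    return np.zeros(2)

displacements = np.zeros((n_sections, 2))
for step in range(10):                        # a few illustrative time steps
    local = {i: section_force(i, displacements) for i in my_sections}
    gathered = comm.gather(local, root=0)
    if rank == 0:
        forces = np.zeros((n_sections, 2))
        for chunk in gathered:
            for i, f in chunk.items():
                forces[i] = f
        # Placeholder for the finite-element structural update (ANFLEX role).
        displacements = displacements + 0.0 * forces
    displacements = comm.bcast(displacements, root=0)
```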

