Backbone: A Multiphysics Framework for Coupling Nuclear Codes Based on CORBA and MPI

Author(s):  
Yu Liu ◽  
Michael Nishimura ◽  
Marat Seydaliev ◽  
Markus Piro

Recent trends in nuclear reactor performance and safety analyses increasingly rely on multiscale multiphysics computer simulations to enhance predictive capabilities by replacing conventional, largely empirically based methods with a more scientifically based methodology. This approach addresses a limitation of the traditional practice of employing a suite of stand-alone codes that independently simulate various physical phenomena: the interdependencies between those phenomena go uncaptured. To capture them, multiple computer simulations of different phenomena must exchange data during runtime. Previously, recommendations have been made regarding various approaches for piloting different design options of data coupling for multiphysics systems (Seydaliev and Caswell, 2014, “CORBA and MPI Based “Backbone” for Coupling Advanced Simulation Tools,” AECL Nucl. Rev., 3(2), pp. 83–90). This paper describes progress since that initial pilot study, covering the implementation and execution of a new distribution framework, referred to as “Backbone,” that provides the necessary runtime exchange of data between different codes. The Backbone, currently under development at the Canadian Nuclear Laboratories (CNL), is a hybrid design using both the common object request broker architecture (CORBA) and message passing interface (MPI) systems. This paper also presents two preliminary cases of coupling existing nuclear performance and safety analysis codes, used for simulating fuel behavior, fission product release, thermal hydraulics, and neutron transport, through the Backbone. Additionally, a pilot study presents several strategies for a new time step controller (TSC) that synchronizes the codes coupled through the Backbone. A performance and fidelity comparison is presented between a simple heuristic method for determining the time step length and a more advanced third-order method, selected to maximize the configurability and effectiveness of temporal integration, saving time steps and reducing wasted computation. The net effect of these features is that the Backbone provides a practical toolset for coupling existing and newly developed codes, which may be written in different programming languages and run on different operating systems, with minimal programming effort, thereby enhancing predictions of nuclear reactor performance and safety.
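
The contrast drawn above between a simple heuristic and a third-order method for choosing the time step length can be made concrete. Below is a minimal sketch in C of both controller styles; all names are illustrative, and the error-based rule is a generic controller of the kind used with higher-order estimators, not the Backbone's actual TSC interface.

```c
#include <math.h>

/* Minimal sketch of two time step controller (TSC) styles; names are
 * illustrative and do not reflect the Backbone's actual interface. */

/* Heuristic rule: grow the step after a converged step, halve it
 * after a failed one. */
double tsc_heuristic(double dt, int converged)
{
    return converged ? dt * 1.2 : dt * 0.5;
}

/* Error-based rule of the kind used with higher-order estimators:
 * scale dt so the estimated local error err meets the tolerance tol,
 * where p is the order of the method (p = 3 for a third-order
 * method, so the local error behaves like dt^(p+1)). */
double tsc_error_based(double dt, double err, double tol, int p)
{
    const double safety = 0.9;
    double factor;

    if (err <= 0.0)                  /* no measurable error: grow */
        return dt * 2.0;
    factor = safety * pow(tol / err, 1.0 / (p + 1));
    if (factor > 2.0) factor = 2.0;  /* limit growth */
    if (factor < 0.2) factor = 0.2;  /* limit shrink */
    return dt * factor;
}
```

The heuristic rule reacts only to convergence failures, whereas the error-based rule scales the step from a local error estimate, which is what allows a higher-order controller to save time steps without sacrificing fidelity.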

2014 ◽  
Vol 3 (2) ◽  
pp. 83-90 ◽  
Author(s):  
M. Seydaliev ◽  
D. Caswell

There is a growing international interest in using coupled, multidisciplinary computer simulations for a variety of purposes, including nuclear reactor safety analysis. Reactor behaviour can be modeled using a suite of computer programs simulating phenomena or predicting parameters that can be categorized into disciplines such as Thermalhydraulics, Neutronics, Fuel, Fuel Channels, Fission Product Release and Transport, Containment and Atmospheric Dispersion, and Severe Accident Analysis. Traditionally, simulations used for safety analysis individually addressed only the behaviour within a single discipline, based upon static input data from other simulation programs. The limitation of using a suite of stand-alone simulations is that phenomenological interdependencies or temporal feedback between the parameters calculated within individual simulations cannot be adequately captured. To remove this shortcoming, multiple computer simulations for different disciplines must exchange data during runtime to address these interdependencies. This article describes the concept of a new framework, which we refer to as the “Backbone,” to provide the necessary runtime exchange of data. The Backbone, currently under development at AECL for a preliminary feasibility study, is a hybrid design using features taken from the Common Object Request Broker Architecture (CORBA), a standard defined by the Object Management Group, and the Message Passing Interface (MPI), a standard developed by a group of researchers from academia and industry. Both have well-tested and efficient implementations, including some that are freely available under the GNU public licenses. The CORBA component enables individual programs written in different languages and running on different platforms within a network to exchange data with each other, thus behaving like a single application. MPI provides the process-to-process intercommunication between these programs. This paper outlines the different CORBA and MPI configurations examined to date, as well as the preliminary configuration selected for coupling two existing safety analysis programs used for modeling thermal-mechanical fuel behavior and fission product behavior, respectively. In addition, preliminary work on hosting both the Backbone and the associated safety analysis programs in a cluster environment is discussed.
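
To illustrate the process-to-process intercommunication that MPI contributes to this design, here is a minimal sketch in C of two coupled programs (ranks 0 and 1) exchanging one field per time step. The variable names and the placeholder coupling model are purely illustrative, and the CORBA layer that the Backbone adds on top is not shown. Run with, e.g., mpirun -np 2.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double fuel_temp = 900.0, fp_release = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int step = 0; step < 10; ++step) {
        if (rank == 0) {        /* fuel-behaviour code */
            MPI_Send(&fuel_temp, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&fp_release, 1, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) { /* fission-product code */
            MPI_Recv(&fuel_temp, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            fp_release = 1.0e-6 * fuel_temp;  /* placeholder model */
            MPI_Send(&fp_release, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
        }
    }
    if (rank == 0)
        printf("final fission product release: %g\n", fp_release);
    MPI_Finalize();
    return 0;
}
```

The send/receive ordering (rank 0 sends first, rank 1 receives first) keeps the exchange deadlock-free at every step.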


Geophysics ◽  
2013 ◽  
Vol 78 (2) ◽  
pp. F7-F15 ◽  
Author(s):  
Robin M. Weiss ◽  
Jeffrey Shragge

Efficiently modeling seismic data sets in complex 3D anisotropic media by solving the 3D elastic wave equation is an important challenge in computational geophysics. Using a stress-stiffness formulation on a regular grid, we tested a 3D finite-difference time-domain solver using a second-order temporal and eighth-order spatial accuracy stencil that leverages the massively parallel architecture of graphics processing units (GPUs) to accelerate the computation of key kernels. The relatively small memory of an individual GPU limits the model domain sizes that can be computed on a single device. To circumvent this constraint and move toward modeling industry-sized 3D anisotropic elastic data sets, we parallelized computation across multiple GPU devices by using domain decomposition and, for each time step, employing an interdevice communication protocol to exchange data values falling within the interior boundaries of each subdomain. For two or more GPU devices within a single compute node, we used direct peer-to-peer (i.e., GPU-to-GPU) communication, whereas for networked nodes we employed message-passing interface directives to route data over the network. Our 2D GPU-based anisotropic elastic modeling tests achieved a [Formula: see text] speedup relative to an OpenMP CPU implementation run on an eight-core machine, whereas our 3D tests using dual-GPU devices produced up to a [Formula: see text] speedup. The performance boost afforded by the GPU architecture allowed us to model seismic data for 3D anisotropic elastic models at lower hardware cost and in less time than was previously possible.
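
A minimal sketch in C of the per-time-step exchange of interior boundary values between neighbouring subdomains follows; the names and memory layout are assumptions, and the direct peer-to-peer path used for GPUs within one node is not shown.

```c
#include <mpi.h>

/* Each rank holds nz planes of nplane values plus one ghost plane on
 * each side, so the slab occupies (nz + 2) * nplane doubles:
 *   ghost | interior 1..nz | ghost.
 * left/right are neighbour ranks (MPI_PROC_NULL at domain edges, in
 * which case MPI_Sendrecv degenerates gracefully to a no-op). */
void exchange_halos(double *f, int nplane, int nz,
                    int left, int right, MPI_Comm comm)
{
    /* first interior plane -> left neighbour;
     * right neighbour's plane -> my right ghost */
    MPI_Sendrecv(f + nplane,            nplane, MPI_DOUBLE, left,  0,
                 f + (nz + 1) * nplane, nplane, MPI_DOUBLE, right, 0,
                 comm, MPI_STATUS_IGNORE);
    /* last interior plane -> right neighbour;
     * left neighbour's plane -> my left ghost */
    MPI_Sendrecv(f + nz * nplane,       nplane, MPI_DOUBLE, right, 1,
                 f,                     nplane, MPI_DOUBLE, left,  1,
                 comm, MPI_STATUS_IGNORE);
}
```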


Author(s):  
Juray De Wilde ◽  
Edward Baudrez ◽  
Geraldine J. Heynderickx ◽  
Jan Vierendeels ◽  
Denis Constales ◽  
...  

A pointwise simultaneous solution algorithm based on dual time stepping was developed by De Wilde et al. (2002). With increasing grid aspect ratios, the efficiency of the point method quickly drops. Most realistic flow cases, however, require grids with high aspect ratios, with the largest grid spacing in the streamwise direction. In this direction, the stiffness is efficiently removed by applying preconditioning (Weiss and Smith, 1995). In the directions perpendicular to the streamwise direction, stiffness remains because of the viscous and acoustic terms. To resolve this problem, a line method is presented. All nodes in a plane perpendicular to the streamwise direction, a so-called line, are solved simultaneously. This allows a fully implicit treatment of the fluxes in the line, removing the stiffness in the line-wise directions. Calculations with different grid aspect ratios are presented to investigate the convergence behavior of the line method. The line method presented is particularly suited for parallelization. At each pseudo-time step, the lines (typically hundreds) can be solved independently of each other. The Message Passing Interface (MPI) standard (Snir et al., 1996) is used. The communication between the processors can easily be reduced by solving a block of lines per processor; communication is then limited to information regarding only the outer lines of each block. In common practice, the number of lines is much larger than the number of available processors. In this regime, the calculation time decreases linearly with the number of processors used.
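
A minimal sketch of the block-of-lines distribution in C, assuming the line count divides evenly among ranks; solve_line() is a hypothetical stand-in for the fully implicit line solve, and the exchange of the outer lines of each block is omitted.

```c
#include <mpi.h>
#include <stdlib.h>

/* hypothetical stand-in for the fully implicit line solve */
static void solve_line(double *line, int n)
{
    for (int i = 0; i < n; ++i)
        line[i] *= 0.99;   /* placeholder update */
}

int main(int argc, char **argv)
{
    int rank, nprocs, n = 256, nlines_total = 800; /* typically hundreds of lines */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int nlines_local = nlines_total / nprocs;      /* assume divisible */
    double *block = calloc((size_t)nlines_local * n, sizeof *block);

    /* each pseudo-time step: lines are independent, so every rank
     * solves its own block; afterwards only the outer lines of each
     * block would need to be exchanged with neighbouring ranks */
    for (int step = 0; step < 100; ++step)
        for (int l = 0; l < nlines_local; ++l)
            solve_line(block + (size_t)l * n, n);

    free(block);
    MPI_Finalize();
    return 0;
}
```

Because each rank communicates only its block's outer lines, communication volume stays fixed as the block grows, which is what gives the near-linear scaling noted above.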


The numerical solution of the heat equation in one space dimension is obtained using the Fourth-Order Iterative Alternating Decomposition Explicit Method (4-IADE) on a parallel platform with the Message Passing Interface (MPI). Here, a fourth-order Crank–Nicolson-type scheme is used in the approximation, which gives rise to a pentadiagonal matrix in the solution of the system at each time level. The method employs a splitting strategy that is applied alternately at each half time step. The method is shown to be computationally stable, and appropriate parameters are chosen to accelerate convergence. The accuracy of the method is comparable to that of existing, well-known methods. Results obtained by this method for several different problems were compared with the exact solution and agreed closely with those obtained by other finite-difference methods, with a clear correlation between speedup and efficiency.
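
For context, the pentadiagonal structure follows from combining a Crank–Nicolson time average with the fourth-order five-point approximation of the second derivative; a sketch of a standard discretization of this kind (not necessarily the exact 4-IADE formulation) is:

```latex
% Crank--Nicolson in time combined with the fourth-order five-point
% approximation of u_xx for the heat equation u_t = alpha u_xx
% (a sketch of this class of scheme, not the exact 4-IADE form):
\frac{u_i^{n+1} - u_i^{n}}{\Delta t}
  = \frac{\alpha}{2}\left( \delta_x^2 u_i^{n+1} + \delta_x^2 u_i^{n} \right),
\qquad
\delta_x^2 u_i
  = \frac{-u_{i-2} + 16\,u_{i-1} - 30\,u_i + 16\,u_{i+1} - u_{i+2}}{12\,\Delta x^2}
```

Because the stencil couples u_{i-2} through u_{i+2}, each implicit time level yields a pentadiagonal system; the IADE splitting then factors this into simpler sweeps applied alternately at each half time step.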


2021 ◽  
Vol 14 (12) ◽  
pp. 7477-7495
Author(s):  
Rafael Lago ◽  
Thomas Gastine ◽  
Tilman Dannert ◽  
Markus Rampp ◽  
Johannes Wicht

Abstract. We discuss two parallelization schemes for MagIC, an open-source, high-performance, pseudo-spectral code for the numerical solution of the magnetohydrodynamics equations in a rotating spherical shell. MagIC calculates the non-linear terms on a numerical grid in spherical coordinates, while the time step updates are performed on radial grid points with a spherical harmonic representation of the lateral directions. Several transforms are required to switch between the different representations. The established hybrid parallelization of MagIC uses message-passing interface (MPI) distribution in radius and relies on existing fast spherical transforms using OpenMP. Our new two-dimensional MPI decomposition implementation also distributes the latitudes or the azimuthal wavenumbers across the available MPI tasks and compute cores. We discuss several non-trivial algorithmic optimizations and the different data distribution layouts employed by our scheme. In particular, the two-dimensional distribution data layout yields a code that strongly scales well beyond the limit of the current one-dimensional distribution. We also show that the two-dimensional distribution implementation, although not yet fully optimized, can already be faster than the existing finely optimized hybrid parallelization when using many thousands of CPU cores. Our analysis indicates that the two-dimensional distribution variant can be further optimized to also surpass the performance of the one-dimensional distribution for a few thousand cores.
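
A minimal sketch of such a two-dimensional decomposition in C, using MPI's Cartesian topology routines; the (radius x latitude) arrangement and the sub-communicators are illustrative of the general approach, not MagIC's actual implementation.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int nprocs, dims[2] = {0, 0}, periods[2] = {0, 0};
    MPI_Comm cart, radial_comm, lat_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* arrange the ranks in a 2D (radius x latitude) process grid */
    MPI_Dims_create(nprocs, 2, dims);           /* e.g. 64 -> 8 x 8 */
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &cart);

    /* sub-communicators along each direction: the transposes between
     * the grid and spherical-harmonic representations then only move
     * data along one row or column of the process grid */
    int keep_r[2]   = {1, 0};
    int keep_lat[2] = {0, 1};
    MPI_Cart_sub(cart, keep_r,   &radial_comm);
    MPI_Cart_sub(cart, keep_lat, &lat_comm);

    MPI_Finalize();
    return 0;
}
```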


2020 ◽  
Vol 15 ◽  
Author(s):  
Weiwen Zhang ◽  
Long Wang ◽  
Theint Theint Aye ◽  
Juniarto Samsudin ◽  
Yongqing Zhu

Background: Genotype imputation as a service is developed to enable researchers to estimate genotypes on haplotyped data without performing whole-genome sequencing. However, genotype imputation is computation-intensive, and it therefore remains a challenge to satisfy the high performance requirements of genome-wide association studies (GWAS).
Objective: In this paper, we propose a high-performance computing solution for genotype imputation on supercomputers to enhance its execution performance.
Method: We design and implement a multi-level parallelization that includes job-level, process-level, and thread-level parallelization, enabled by job scheduling management, the message passing interface (MPI), and OpenMP, respectively. It involves job distribution, chunk partition and execution, parallelized iteration for imputation, and data concatenation. This multi-level design lets us exploit multi-machine/multi-core architectures to improve the performance of genotype imputation.
Results: Experimental results show that our proposed method outperforms the Hadoop-based implementation of genotype imputation. Moreover, we conduct experiments on supercomputers to evaluate the performance of the proposed method. The evaluation shows that it significantly shortens the execution time, thus improving the performance of genotype imputation.
Conclusion: The proposed multi-level parallelization, when deployed as imputation as a service, will facilitate bioinformatics researchers in Singapore in conducting genotype imputation and enhance association studies.
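
A minimal sketch of the process- and thread-level layers in C (the job level sits outside the program, in the scheduler); the round-robin chunk assignment and the per-site placeholder are illustrative assumptions, not the paper's implementation.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nprocs;
    const int nchunks = 64, sites_per_chunk = 1000;
    double work_done = 0.0;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* process level: MPI ranks take chunks round-robin */
    for (int c = rank; c < nchunks; c += nprocs) {
        /* thread level: OpenMP splits the per-chunk iterations */
        #pragma omp parallel for reduction(+:work_done)
        for (int i = 0; i < sites_per_chunk; ++i)
            work_done += 1.0;   /* a per-site imputation kernel would go here */
    }
    printf("rank %d processed %.0f sites\n", rank, work_done);
    MPI_Finalize();
    return 0;
}
```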


Energies ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 2284
Author(s):  
Krzysztof Przystupa ◽  
Mykola Beshley ◽  
Olena Hordiichuk-Bublivska ◽  
Marian Kyryk ◽  
Halyna Beshley ◽  
...  

Analyzing large amounts of user data to determine user preferences and, on that basis, to recommend new products is an important problem. Depending on the correctness and timeliness of the recommendations, significant profits or losses can result. This analysis of user data is carried out by dedicated recommendation systems. However, with a large number of users, the volume of data to be processed grows rapidly, which complicates the operation of recommendation systems. For efficient data analysis in commercial systems, the Singular Value Decomposition (SVD) method can be used for intelligent analysis of information. For large amounts of data, we propose using distributed systems; this approach reduces the time needed to process data and deliver recommendations to users. For the experimental study, we implemented the distributed SVD method using Message Passing Interface, Hadoop, and Spark technologies; the results show reduced data processing times for distributed systems compared to non-distributed ones.
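
One common building block for distributing SVD-based analysis is to partition the user-item matrix by rows (users) and accumulate the small Gram matrix A^T A with a global reduction, assuming the number of items k is small enough for the k x k Gram matrix to fit on each rank. A minimal sketch in C follows, with all names illustrative rather than taken from the paper's implementation.

```c
#include <mpi.h>
#include <stdlib.h>

/* accumulate this rank's contribution A_local^T * A_local into G;
 * A_local is rows_local x k (row-major), G is k x k and must be
 * zero-initialised */
void local_gram(const double *A_local, int rows_local, int k, double *G)
{
    for (int r = 0; r < rows_local; ++r)
        for (int i = 0; i < k; ++i)
            for (int j = 0; j < k; ++j)
                G[i * k + j] += A_local[r * k + i] * A_local[r * k + j];
}

void distributed_gram(const double *A_local, int rows_local, int k,
                      double *G_global, MPI_Comm comm)
{
    double *G_local = calloc((size_t)k * k, sizeof *G_local);
    local_gram(A_local, rows_local, k, G_local);
    /* sum the local contributions: every rank ends up with the full
     * k x k Gram matrix, which it can eigendecompose locally to
     * recover the singular values of A */
    MPI_Allreduce(G_local, G_global, k * k, MPI_DOUBLE, MPI_SUM, comm);
    free(G_local);
}
```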


1996 ◽  
Vol 22 (6) ◽  
pp. 789-828 ◽  
Author(s):  
William Gropp ◽  
Ewing Lusk ◽  
Nathan Doss ◽  
Anthony Skjellum
