Simultaneous Solution Algorithms for Gas-Solid Flows: An Efficient Parallel Line Solver

Author(s):  
Juray De Wilde ◽  
Edward Baudrez ◽  
Geraldine J. Heynderickx ◽  
Jan Vierendeels ◽  
Denis Constales ◽  
...  

A pointwise simultaneous solution algorithm based on dual time stepping was developed by De Wilde et al. (2002). With increasing grid aspect ratios, the efficiency of the point method drops quickly. Most realistic flow cases, however, require grids with high aspect ratios, with the largest grid spacing in the streamwise direction. In this direction, the stiffness is efficiently removed by applying preconditioning (Weiss and Smith, 1995). In the direction perpendicular to the streamwise direction, stiffness remains because of the viscous and acoustic terms. To resolve this problem, a line method is presented. All nodes in a plane perpendicular to the streamwise direction, a so-called line, are solved simultaneously. This allows a fully implicit treatment of the fluxes in the line, removing the stiffness in the linewise directions. Calculations with different grid aspect ratios are presented to investigate the convergence behavior of the line method. The line method presented is particularly suited for parallelization. At each pseudo-time step, the lines (typically hundreds) can be solved independently of each other. The Message Passing Interface (MPI) standard (Snir et al., 1996) is used. The communication between the processors can easily be reduced by solving a block of lines per processor. The communication is then limited to information regarding only the outer lines of the block. In common practice, the number of lines is much higher than the number of processors available. In this region of the lines/processor space, the calculation time decreases linearly with the number of processors used.
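The block-of-lines idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `partition_lines` and `lines_to_send` are hypothetical, no MPI calls are made, and lines are assumed to be numbered consecutively in the streamwise direction so that each block has at most two neighbours.

```python
def partition_lines(n_lines, n_procs):
    """Split line indices 0..n_lines-1 into contiguous blocks, one per processor.

    Block sizes differ by at most one line, so the work is roughly balanced.
    """
    base, extra = divmod(n_lines, n_procs)
    blocks = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def lines_to_send(block, n_lines):
    """Return the outer lines of a block that border another block.

    Only these lines would need to be communicated; interior lines of the
    block stay local to the processor (assumption of this sketch).
    """
    if len(block) == 0:
        return []
    outer = set()
    if block[0] > 0:                # there is a neighbouring block upstream
        outer.add(block[0])
    if block[-1] < n_lines - 1:     # there is a neighbouring block downstream
        outer.add(block[-1])
    return sorted(outer)


if __name__ == "__main__":
    blocks = partition_lines(300, 8)
    for rank, block in enumerate(blocks):
        print(rank, len(block), lines_to_send(block, 300))
```

With hundreds of lines and a handful of processors, each processor in this sketch exchanges at most two lines per pseudo-time step regardless of block size, which is consistent with the communication pattern the abstract describes.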

Geophysics ◽  
2013 ◽  
Vol 78 (2) ◽  
pp. F7-F15 ◽  
Author(s):  
Robin M. Weiss ◽  
Jeffrey Shragge

Efficiently modeling seismic data sets in complex 3D anisotropic media by solving the 3D elastic wave equation is an important challenge in computational geophysics. Using a stress-stiffness formulation on a regular grid, we tested a 3D finite-difference time-domain solver using a second-order temporal and eighth-order spatial accuracy stencil that leverages the massively parallel architecture of graphics processing units (GPUs) to accelerate the computation of key kernels. The relatively small memory of an individual GPU limits the model domain sizes that can be computed on a single device. To circumvent this constraint and move toward modeling industry-sized 3D anisotropic elastic data sets, we parallelized computation across multiple GPU devices by using domain decomposition and, for each time step, employing an interdevice communication protocol to exchange data values falling within interior boundaries of each subdomain. For two or more GPU devices within a single compute node, we used direct peer-to-peer (i.e., GPU-to-GPU) communication, whereas for networked nodes we employed Message Passing Interface (MPI) directives to route data over the network. Our 2D GPU-based anisotropic elastic modeling tests achieved a [Formula: see text] speedup relative to an OpenMP CPU implementation run on an eight-core machine, whereas our 3D tests using dual-GPU devices produced up to a [Formula: see text] speedup. The performance boost afforded by the GPU architecture allowed us to model seismic data for 3D anisotropic elastic models at lower hardware cost and in less time than has been previously possible.
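To make the role of the exchanged boundary values concrete, the following is a minimal 1D sketch of domain decomposition with halo points. It is a host-side stand-in under stated assumptions, not the authors' GPU code: the names `d2`, `second_derivative_interior`, and `split_with_halo` are hypothetical, the "exchange" is a plain list copy rather than peer-to-peer or MPI traffic, and the weights are the standard eighth-order central-difference coefficients for a second derivative, which need four neighbours on each side.

```python
# Standard eighth-order central-difference weights for a second derivative.
C = [-205 / 72, 8 / 5, -1 / 5, 8 / 315, -1 / 560]
HALO = 4  # stencil half-width: points needed from the neighbouring subdomain


def d2(u, i, dx):
    """Eighth-order approximation of the second derivative of u at index i."""
    acc = C[0] * u[i]
    for k in range(1, HALO + 1):
        acc += C[k] * (u[i - k] + u[i + k])
    return acc / dx**2


def second_derivative_interior(u, dx):
    """Apply the stencil wherever all HALO neighbours exist on both sides."""
    return [d2(u, i, dx) for i in range(HALO, len(u) - HALO)]


def split_with_halo(u, cut):
    """Split u at index `cut`; each half also keeps HALO points of the other.

    The copied points play the role of the values exchanged between devices
    at every time step.
    """
    if not HALO <= cut <= len(u) - HALO:
        raise ValueError("cut must leave at least HALO points on each side")
    left = u[: cut + HALO]
    right = u[cut - HALO :]
    return left, right


if __name__ == "__main__":
    u = [float(i * i) for i in range(24)]   # u(x) = x^2, so u'' = 2
    left, right = split_with_halo(u, 12)
    whole = second_derivative_interior(u, 1.0)
    pieces = second_derivative_interior(left, 1.0) + second_derivative_interior(right, 1.0)
    print(whole == pieces)
```

Because each subdomain sees exactly the same neighbour values as the undivided grid, the decomposed result matches the single-domain result point for point; only the four-point halo on each side of an interior boundary has to be refreshed between time steps.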


Author(s):  
Yu Liu ◽  
Michael Nishimura ◽  
Marat Seydaliev ◽  
Markus Piro

Recent trends in nuclear reactor performance and safety analyses increasingly rely on multiscale multiphysics computer simulations to enhance predictive capabilities by replacing conventional methods that are largely empirically based with a more scientifically based methodology. Through this approach, one addresses the issue of traditionally employing a suite of stand-alone codes that independently simulate various physical phenomena that were previously disconnected. Multiple computer simulations of different phenomena must exchange data during runtime to address these interdependencies. Previously, recommendations have been made regarding various approaches for piloting different design options of data coupling for multiphysics systems (Seydaliev and Caswell, 2014, “CORBA and MPI Based “Backbone” for Coupling Advanced Simulation Tools,” AECL Nucl. Rev., 3(2), pp. 83–90). This paper describes progress since the initial pilot study that outlined the implementation and execution of a new distribution framework, referred to as “Backbone,” to provide the necessary runtime exchange of data between different codes. The Backbone, currently under development at the Canadian Nuclear Laboratories (CNL), is a hybrid design using both common object request broker architecture (CORBA) and message passing interface (MPI) systems. This paper also presents two preliminary cases for coupling existing nuclear performance and safety analysis codes used for simulating fuel behavior, fission product release, thermal hydraulics, and neutron transport through the Backbone. Additionally, a pilot study presents a few strategies of a new time step controller (TSC) to synchronize the codes coupled through the Backbone. 
A performance and fidelity comparison is presented between a simple heuristic method for determining time step length and a more advanced third-order method, which was selected to maximize configurability and effectiveness of temporal integration, saving time steps and reducing wasted computation. The net effect of the foregoing features of the Backbone is to provide a practical toolset to couple existing and newly developed codes—which may be written in different programming languages and used on different operating systems—with minimal programming effort to enhance predictions of nuclear reactor performance and safety.


2021 ◽  
Author(s):  
Giorgio Micaletto ◽  
Ivano Barletta ◽  
Silvia Mocavero ◽  
Ivan Federico ◽  
Italo Epicoco ◽  
...  

Abstract. This paper presents the MPI-based parallelization of the three-dimensional hydrodynamic model SHYFEM (System of HydrodYnamic Finite Element Modules). The original sequential version of the code was parallelized in order to reduce the execution time of high-resolution configurations using state-of-the-art HPC systems. A distributed memory approach was used, based on the message passing interface (MPI). Optimized numerical libraries were used to partition the unstructured grid (with a focus on load balancing) and to solve the sparse linear system of equations in parallel in the case of semi-to-fully implicit time stepping. The parallel implementation of the model was validated by comparing the outputs with those obtained from the sequential version. The performance assessment demonstrates a good level of scalability with a realistic configuration used as benchmark.


1973 ◽  
Vol 13 (06) ◽  
pp. 311-320 ◽  
Author(s):  
F. Sonier ◽  
O. Ombret

Abstract. This paper describes a two-dimensional three-phase numerical model for simulating two- or three-phase coning behavior. The model is fully implicit with respect to all variables and uses the simultaneous solution of the different equations describing multiphase flow. For determining well flow rates from all blocks communicating with the well, particular attention has been paid to the well boundary condition, which is considered to be a physical boundary. The mathematical expression of these well conditions enables flow rates to be calculated in a perfectly implicit manner and thus makes the model very stable so that the computational error in time is very small. The model described is appreciably different in this respect from previous models in which the well is represented by source points and in which the flow terms are calculated by using various simplifications. The results of several tests are presented. The model was checked by the simulation of several water coning cases that had previously been studied on a physical model. Four examples are given here. In these examples, the boundary influx conditions and fluid mobility ratio are made to vary. One of them illustrates the care that must be taken when using simplified solution schemes for the boundary conditions.

Introduction. Multiphase numerical models have usually employed finite-difference approximations in which relative permeabilities are evaluated explicitly at the beginning of each time step. But simulators of this type are not capable of solving problems characterized by high flow velocities and such phenomena as well coning, except perhaps at extremely high cost. Recently, some papers were published describing a method that employs semi-implicit relative permeabilities and uses the simultaneous solution of multiphase equations. This method is very efficient. In these simulators, the well is represented by source points, and flow rate terms are calculated by using various simplifications (mobility or potential methods).

This paper describes a new numerical coning model. The numerical part of the model is similar to that in the latest models, but its representation of wellbore conditions is quite different and more nearly expresses physical phenomena caused by end effects. The well is represented full-scale and not by source points. Furthermore, so as not to partially screen out wellbore conditions, the producing interval, even if it is small, may be advantageously represented by several layers. Any condition may be specified for the external boundaries. All the leading physical parameters are treated semi-implicitly. When a flow rate is imposed on the well, taking into account the well-wall boundary conditions, the calculation of production terms is fully implicit. This calculation is iterative, but at almost each time step a simple algorithm enables a direct solution to be obtained. The results of numerous simulations are presented. Studies on physical models have demonstrated the full validity of the numerical model. The simulation of actual field cases shows that the model is very efficient.

Coning Model. The numerical model described in this paper is a two-dimensional one with radial symmetry. A compressible three-phase system is considered, with possible exchange between the gas and oil phases independently of the composition. The introduction of Darcy's law into the continuity equation for each of the three fluids leads to a system of partial differential equations.

SPEJ p. 311


2014 ◽  
Vol 20 (2) ◽  
pp. 237-256 ◽  
Author(s):  
Junfu Fan ◽  
Min Ji ◽  
Guomin Gu ◽  
Yong Sun

In buffer zone construction, the rasterization-based dilation method inevitably introduces errors, and the double-sided parallel line method involves a series of complex operations. In this paper, we propose a parallel buffer algorithm based on area merging and MPI (Message Passing Interface) to improve the performance of buffer analysis on large datasets. Experimental results reveal three major performance bottlenecks that significantly affect the efficiency of serial and parallel buffer construction: the area merging strategy, the task load balance method, and the MPI inter-process results merging strategy. Corresponding optimization approaches, involving a tree-like area merging strategy, a vertex-number-oriented parallel task partition method, and an inter-process results merging strategy, are suggested to overcome these bottlenecks. Experiments were carried out to examine the performance of the optimized parallel algorithm. The results suggest that the optimization approaches can provide high performance and processing ability for buffer construction in a cluster parallel environment. Our method could provide insights into the parallelization of spatial analysis algorithms.
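A tree-like merging strategy can be sketched in one dimension, where the buffer of a point is simply an interval and merging two partial results means taking the union of their intervals. This is only an analogy under stated assumptions: the paper merges polygon areas, and the names `merge_two`, `tree_merge`, and `buffer_points` are hypothetical.

```python
def merge_two(a, b):
    """Union of two lists of (lo, hi) intervals, returned sorted and disjoint."""
    merged = []
    for lo, hi in sorted(a + b):
        if merged and lo <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def tree_merge(parts):
    """Merge partial results pairwise, roughly halving their number each round."""
    parts = list(parts)
    if not parts:
        return []
    while len(parts) > 1:
        nxt = [merge_two(parts[i], parts[i + 1]) for i in range(0, len(parts) - 1, 2)]
        if len(parts) % 2:          # odd one out is carried to the next round
            nxt.append(parts[-1])
        parts = nxt
    return parts[0]


def buffer_points(xs, d):
    """Buffer each 1D point by distance d, then merge the buffers tree-wise."""
    return tree_merge([[(x - d, x + d)] for x in xs])


if __name__ == "__main__":
    print(buffer_points([0, 1, 5], 1))
```

The pairwise rounds are independent of one another within a round, which is why a tree-shaped merge lends itself to being spread across processes, whereas a purely sequential merge accumulates everything into one growing result.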


The numerical solution of the heat equation in one space dimension is obtained using the Fourth-Order Iterative Alternating Decomposition Explicit Method (4-IADE) on a parallel platform with the Message Passing Interface (MPI). Here, a higher fourth-order Crank-Nicolson type scheme is used in the approximation, which gives rise to a pentadiagonal matrix in the solution of the system at each time level. The method employs a splitting strategy which is applied alternately at each half time step. The method is shown to be computationally stable, and appropriate parameters are chosen to accelerate convergence. The accuracy of the method is comparable to that of existing well-known methods. Results obtained by this method for several different problems were compared with the exact solution and agreed closely with those obtained by other finite-difference methods, with a correlation between speedup and efficiency.


2021 ◽  
Vol 14 (12) ◽  
pp. 7477-7495
Author(s):  
Rafael Lago ◽  
Thomas Gastine ◽  
Tilman Dannert ◽  
Markus Rampp ◽  
Johannes Wicht

Abstract. We discuss two parallelization schemes for MagIC, an open-source, high-performance, pseudo-spectral code for the numerical solution of the magnetohydrodynamics equations in a rotating spherical shell. MagIC calculates the non-linear terms on a numerical grid in spherical coordinates, while the time step updates are performed on radial grid points with a spherical harmonic representation of the lateral directions. Several transforms are required to switch between the different representations. The established hybrid parallelization of MagIC uses message-passing interface (MPI) distribution in radius and relies on existing fast spherical transforms using OpenMP. Our new two-dimensional MPI decomposition implementation also distributes the latitudes or the azimuthal wavenumbers across the available MPI tasks and compute cores. We discuss several non-trivial algorithmic optimizations and the different data distribution layouts employed by our scheme. In particular, the two-dimensional distribution data layout yields a code that strongly scales well beyond the limit of the current one-dimensional distribution. We also show that the two-dimensional distribution implementation, although not yet fully optimized, can already be faster than the existing finely optimized hybrid parallelization when using many thousands of CPU cores. Our analysis indicates that the two-dimensional distribution variant can be further optimized to also surpass the performance of the one-dimensional distribution for a few thousand cores.


Computation ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 84
Author(s):  
Gokhan Kirkil

We propose a method to parallelize a 3D incompressible Navier–Stokes solver that uses a fully implicit fractional-step method to simulate sediment transport in prismatic channels. The governing equations are transformed into generalized curvilinear coordinates on a non-staggered grid. To develop a parallel version of the code that can run on various platforms, in particular on PC clusters, it was decided to parallelize the code using the Message Passing Interface (MPI), which is one of the most flexible parallel programming libraries. Code parallelization is accomplished by “message passing” whereby the computer explicitly uses library calls to accomplish communication between the individual processors of the machine (e.g., PC cluster). As a part of the parallelization effort, besides the Navier–Stokes solver, the deformable bed module used in simulations with loose beds is also parallelized. The flow, sediment transport, and bathymetry at equilibrium conditions were computed with the parallel and serial versions of the code for the case of a 140-degree curved channel bend of rectangular section. The parallel simulation conducted on eight processors gives exactly the same results as the serial solver. The parallel version of the solver showed good scalability.


2020 ◽  
Vol 15 ◽  
Author(s):  
Weiwen Zhang ◽  
Long Wang ◽  
Theint Theint Aye ◽  
Juniarto Samsudin ◽  
Yongqing Zhu

Background: Genotype imputation as a service is developed to enable researchers to estimate genotypes on haplotyped data without performing whole genome sequencing. However, genotype imputation is computationally intensive, and thus it remains a challenge to satisfy the high performance requirement of genome-wide association studies (GWAS). Objective: In this paper, we propose a high performance computing solution for genotype imputation on supercomputers to enhance its execution performance. Method: We design and implement a multi-level parallelization that includes job level, process level and thread level parallelization, enabled by job scheduling management, message passing interface (MPI) and OpenMP, respectively. It involves job distribution, chunk partition and execution, parallelized iteration for imputation and data concatenation. Due to the design of multi-level parallelization, we can exploit the multi-machine/multi-core architecture to improve the performance of genotype imputation. Results: Experimental results show that our proposed method can outperform the Hadoop-based implementation of genotype imputation. Moreover, we conduct the experiments on supercomputers to evaluate the performance of the proposed method. The evaluation shows that it can significantly shorten the execution time, thus improving the performance for genotype imputation. Conclusion: The proposed multi-level parallelization, when deployed as an imputation as a service, will facilitate bioinformatics researchers in Singapore to conduct genotype imputation and enhance the association study.


Water ◽  
2020 ◽  
Vol 12 (6) ◽  
pp. 1639
Author(s):  
Abdelkrim Aharmouch ◽  
Brahim Amaziane ◽  
Mustapha El Ossmani ◽  
Khadija Talali

We present a numerical framework for efficiently simulating seawater flow in coastal aquifers using a finite volume method. The mathematical model consists of coupled and nonlinear partial differential equations. Difficulties arise from the nonlinear structure of the system and the complexity of natural fields, which results in complex aquifer geometries and heterogeneity in the hydraulic parameters. When numerically solving such a model, these features mean that attempts to perform the time integration explicitly result in an excessively restrictive stability condition on the time step. An implicit method, which calculates the flow dynamics at each time step, is needed to overcome the stability problem of the time integration and mass conservation. A fully implicit finite volume scheme is developed to discretize the coupled system that allows the use of much longer time steps than explicit schemes. We have developed and implemented this scheme in a new module in the context of the open source platform DuMuX. The accuracy and effectiveness of this new module are demonstrated through numerical investigation for simulating the displacement of the sharp interface between saltwater and freshwater in groundwater flow. Lastly, numerical results of a realistic test case are presented to prove the efficiency and the performance of the method.
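The contrast between explicit and implicit time stepping can be seen on a much simpler problem than the one treated in the paper. The sketch below uses the linear 1D heat equation with zero boundary values as a stand-in; it is not the seawater intrusion model and does not use DuMuX. The function names are hypothetical, and `r` denotes the dimensionless ratio of time step to squared grid spacing (times the diffusivity). For this model problem the explicit scheme is stable only for `r <= 0.5`, whereas backward Euler is stable for any `r`.

```python
def explicit_step(u, r):
    """One forward-Euler step of u_t = u_xx with u = 0 at both ends."""
    inner = [u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1]) for i in range(1, len(u) - 1)]
    return [0.0] + inner + [0.0]


def implicit_step(u, r):
    """One backward-Euler step: solve a tridiagonal system by the Thomas algorithm.

    Each interior unknown v_i satisfies  -r*v_{i-1} + (1+2r)*v_i - r*v_{i+1} = u_i.
    """
    n = len(u) - 2                      # number of interior unknowns
    a, b, c = -r, 1 + 2 * r, -r         # sub-, main and super-diagonal entries
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c / b
    dp[0] = u[1] / b
    for i in range(1, n):               # forward elimination
        m = b - a * cp[i - 1]
        cp[i] = c / m
        dp[i] = (u[i + 1] - a * dp[i - 1]) / m
    v = [0.0] * n
    v[-1] = dp[-1]
    for i in range(n - 2, -1, -1):      # back substitution
        v[i] = dp[i] - cp[i] * v[i + 1]
    return [0.0] + v + [0.0]


def run(step, u, r, n_steps):
    """Advance u by n_steps using the given stepping function."""
    for _ in range(n_steps):
        u = step(u, r)
    return u


if __name__ == "__main__":
    u0 = [0.0] * 5 + [1.0] + [0.0] * 5   # a single spike in the middle
    r = 2.0                              # four times the explicit stability limit
    print(max(abs(x) for x in run(explicit_step, u0, r, 20)))
    print(max(abs(x) for x in run(implicit_step, u0, r, 20)))
```

At `r = 2.0` the explicit solution grows without bound while the implicit one decays smoothly, at the price of a linear solve per step. For a nonlinear coupled system such as the one in the abstract, each implicit step would additionally require a nonlinear iteration, which this sketch does not attempt.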

