Three-dimensional deformable-grid electromagnetic particle-in-cell for parallel computers

1999 ◽  
Vol 61 (3) ◽  
pp. 367-389 ◽  
Author(s):  
J. WANG ◽  
D. KONDRASHOV ◽  
P. C. LIEWER ◽  
S. R. KARMESIN

We describe a new parallel, non-orthogonal-grid, three-dimensional electromagnetic particle-in-cell (EMPIC) code based on a finite-volume formulation. This code uses a logically Cartesian grid of deformable hexahedral cells, a discrete surface integral (DSI) algorithm to calculate the electromagnetic field, and a hybrid logical–physical space algorithm to push particles. We investigate the numerical instability of the DSI algorithm for non-orthogonal grids, analyse the accuracy of EMPIC simulations on non-orthogonal grids, and present performance benchmarks of this code on a parallel supercomputer. While the hybrid particle-push algorithm has second-order accuracy in space, the accuracy of the DSI field-solve algorithm is between first and second order for non-orthogonal grids. The parallel implementation of this code, which is almost identical to that of a Cartesian-grid EMPIC code using domain decomposition, achieved a high parallel efficiency of over 96% for large-scale simulations.
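
A minimal sketch (our illustration, not the authors' code) of the idea behind a hybrid logical–physical push on a deformable hexahedral grid: the particle's logical coordinates locate its cell and interpolation weights, while its physical position follows from a trilinear map through the cell's eight (possibly deformed) corner nodes.

```python
# Trilinear map from logical coordinates (xi, eta, zeta) in [0, 1]^3 to the
# physical position inside one deformable hexahedral cell. The corner ordering
# below is an assumption for this sketch.
import numpy as np

def logical_to_physical(corners, xi, eta, zeta):
    """corners: (8, 3) cell vertices ordered as (i, j, k) in {0,1}^3
    with flat index = i + 2*j + 4*k."""
    w = np.array([
        (1 - xi) * (1 - eta) * (1 - zeta),
        xi       * (1 - eta) * (1 - zeta),
        (1 - xi) * eta       * (1 - zeta),
        xi       * eta       * (1 - zeta),
        (1 - xi) * (1 - eta) * zeta,
        xi       * (1 - eta) * zeta,
        (1 - xi) * eta       * zeta,
        xi       * eta       * zeta,
    ])
    return w @ corners          # physical position of the particle

# Example: a unit cube deformed by shearing its top (k = 1) face in x.
corners = np.array([[i, j, k] for k in (0, 1) for j in (0, 1) for i in (0, 1)],
                   dtype=float)
corners[4:, 0] += 0.2
print(logical_to_physical(corners, 0.5, 0.5, 0.5))   # ~[0.6, 0.5, 0.5]
```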

2012 ◽  
pp. 497-511
Author(s):  
V.E. Malyshkin

The main ideas of Assembly Technology (AT), as applied to the parallel implementation of large-scale realistic numerical models on a rectangular mesh, are presented and demonstrated through the parallelization (fragmentation) of a Particle-in-Cell (PIC) application that solves the problem of energy exchange in a plasma cloud. Implementing numerical models with AT is based on constructing a fragmented parallel program. Assembling a numerical simulation program under AT automatically provides the target program with useful dynamic properties, including dynamic load balancing based on the migration of fragments from overloaded to underloaded processor elements of a multicomputer. The program-assembly approach can also be viewed as a combination and adaptation, for parallel programming, of the well-known modular programming and domain decomposition techniques, supported by system software for assembling fragmented programs.
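
A minimal sketch, in the spirit of AT's dynamic load balancing but not the AT system software itself: work "fragments" are migrated greedily from the most loaded processor element to the least loaded one until the imbalance falls below a tolerance or no single migration still helps.

```python
def rebalance(assignment, cost, n_procs, tol=0.1):
    """assignment: dict frag_id -> owner rank; cost: dict frag_id -> work units."""
    while True:
        load = [0.0] * n_procs
        for f, r in assignment.items():
            load[r] += cost[f]
        hot = max(range(n_procs), key=lambda r: load[r])     # most loaded PE
        cold = min(range(n_procs), key=lambda r: load[r])    # least loaded PE
        if load[hot] - load[cold] <= tol * max(load[hot], 1e-12):
            return assignment                                # balanced enough
        movable = [f for f, r in assignment.items() if r == hot]
        frag = min(movable, key=lambda f: cost[f])           # cheapest fragment
        if load[hot] - cost[frag] < load[cold] + cost[frag]:
            return assignment                                # no move still helps
        assignment[frag] = cold                              # migrate the fragment

# Example: six fragments of unequal cost, all initially on rank 0 of 2 ranks.
assignment = {f: 0 for f in range(6)}
cost = {0: 5, 1: 3, 2: 2, 3: 2, 4: 1, 5: 1}
print(rebalance(assignment, cost, n_procs=2))
```

In a real fragmented program the migration would also ship the fragment's data; this sketch only rewrites the ownership map.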


2016 ◽  
Vol 19 (1) ◽  
pp. 205-225 ◽  
Author(s):  
Jean-Noel G. Leboeuf ◽  
Viktor K. Decyk ◽  
David E. Newman ◽  
Raul Sanchez

The massively parallel, nonlinear, three-dimensional (3D), toroidal, electrostatic, gyrokinetic, particle-in-cell (PIC), Cartesian-geometry UCAN code, with particle ions and adiabatic electrons, has been successfully exercised to identify non-diffusive transport characteristics in present-day tokamak discharges. The limitation in applying UCAN to larger discharges is the 1D domain decomposition in the toroidal (or z-) direction for massively parallel implementation using MPI, which has restricted the calculations to a few hundred ion Larmor radii (gyroradii) per plasma minor radius. To exceed these sizes, we have implemented 2D domain decomposition in UCAN by adding the y-direction to the processor mix. This has been facilitated by the use of relevant components in the P2LIB library of field and particle management routines developed for UCLA's UPIC Framework of conventional PIC codes. The gyro-averaging specific to gyrokinetic codes is simplified by the use of replicated arrays for efficient charge accumulation and force deposition. The 2D domain-decomposed UCAN2 code reproduces the original 1D-domain nonlinear results to within round-off. Benchmarks of UCAN2 on the Cray XC30 Edison at NERSC demonstrate ideal scaling when the problem size is increased along with the processor count up to the largest power of 2 available, namely 131,072 processors. These particle weak-scaling benchmarks also indicate that the 1 nanosecond per particle per time step and 1 TFlops barriers are easily broken by UCAN2 with 1 billion or more particles and 2000 or more processors.
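
A minimal sketch (an assumption for illustration, not UCAN2 source) of deposition into replicated grid copies followed by a reduction, the pattern the abstract describes for efficient charge accumulation: replication lets each worker scatter to its own copy without write conflicts, and the copies are summed afterwards.

```python
# Nearest-grid-point charge deposition into per-worker replicated arrays.
import numpy as np

def deposit_replicated(x, q, nx, n_workers=4):
    """Deposit charges q at positions x in [0, nx) onto a 1D grid of nx cells."""
    rho_copies = np.zeros((n_workers, nx))
    chunks = np.array_split(np.arange(len(x)), n_workers)
    for w, idx in enumerate(chunks):              # each worker owns one copy
        cells = np.floor(x[idx]).astype(int) % nx
        np.add.at(rho_copies[w], cells, q[idx])   # scatter without conflicts
    return rho_copies.sum(axis=0)                 # reduce the copies into one grid

rng = np.random.default_rng(0)
x = rng.uniform(0, 64, size=100_000)
rho = deposit_replicated(x, np.ones_like(x), nx=64)
print(rho.sum())   # total charge is conserved: 100000.0
```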


2019 ◽  
Vol 9 (24) ◽  
pp. 5437
Author(s):  
Lei Xiao ◽  
Guoxiang Yang ◽  
Kunyang Zhao ◽  
Gang Mei

In numerical modeling, mesh quality is one of the decisive factors that strongly affects the accuracy of calculations and the convergence of iterations. To improve mesh quality, the Laplacian mesh smoothing method, which repositions nodes to the barycenter of adjacent nodes without changing the mesh topology, has been widely used. However, smoothing a large-scale three-dimensional mesh is quite computationally expensive, and few studies have focused on accelerating the Laplacian mesh smoothing method by utilizing the graphics processing unit (GPU). This paper presents a GPU-accelerated parallel algorithm for Laplacian smoothing in three dimensions that considers the influence of different data layouts and iteration forms. To evaluate the efficiency of the GPU implementation, the parallel solution is compared with the original serial solution. Experimental results show that our parallel implementation is up to 46 times faster than the serial version.
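
A minimal serial sketch (not the paper's GPU code) of one Jacobi-style Laplacian smoothing pass: every interior node moves to the barycenter of its adjacent nodes, while boundary nodes stay fixed. Because all nodes read the previous positions, the same per-node update is what a GPU kernel would apply with one thread per node.

```python
import numpy as np

def laplacian_smooth(coords, adjacency, fixed, n_iter=10):
    """coords: (n, 3) node positions; adjacency: list of neighbor-index arrays;
    fixed: (n,) bool mask of boundary nodes that must not move."""
    for _ in range(n_iter):
        new = coords.copy()
        for i, nbrs in enumerate(adjacency):
            if not fixed[i]:
                new[i] = coords[nbrs].mean(axis=0)   # barycenter of neighbors
        coords = new            # Jacobi update: all nodes read old positions
    return coords

# Example: smooth a jittered polyline with its endpoints held fixed.
rng = np.random.default_rng(1)
n = 6
coords = (np.linspace(0.0, 1.0, n)[:, None] * np.ones(3)
          + 0.05 * rng.standard_normal((n, 3)))
adjacency = ([np.array([1])]
             + [np.array([i - 1, i + 1]) for i in range(1, n - 1)]
             + [np.array([n - 2])])
fixed = np.zeros(n, dtype=bool)
fixed[[0, -1]] = True
print(laplacian_smooth(coords, adjacency, fixed))
```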


1994 ◽  
Vol 12 (2) ◽  
pp. 273-282 ◽  
Author(s):  
Glenn Joyce ◽  
Jonathan Krall ◽  
Steven Slinker

ELBA is a three-dimensional particle-in-cell simulation code that has been developed to study the propagation and transport of relativistic charged-particle beams. The code is particularly suited to the simulation of relativistic electron beams propagating through collisionless or slightly collisional plasmas or through external electric or magnetic fields. Particle motion is followed via a coordinate “window” in the laboratory frame that moves at the speed of light. This scheme allows us to model only the immediate vicinity of the beam. Because no information can move in the forward direction in these coordinates, particle and field data can be handled in a simple way that allows for very large-scale simulations. A mapping scheme has been implemented that, with corrections to Maxwell's equations, allows the inclusion of bends in the simulation system.
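
A minimal sketch (assumed for illustration, not ELBA itself) of the bookkeeping behind a window that moves at the speed of light: particles are binned by the window coordinate xi = c*t − z (distance behind the window front), and because nothing can outrun the front, slices can be processed in a single front-to-back sweep, each slice depending only on the slices ahead of it.

```python
import numpy as np

C = 299_792_458.0                     # speed of light [m/s]

def window_slices(z, t, window_length, n_slices):
    """Return the slice index of each particle inside the moving window."""
    xi = C * t - z                    # 0 at the front, grows toward the tail
    inside = (xi >= 0) & (xi < window_length)
    idx = np.floor(xi[inside] / window_length * n_slices).astype(int)
    return idx, inside

# Front-to-back sweep: slice 0 (front) first, then successively older slices.
z = np.random.default_rng(2).uniform(0.0, 3.0, 1000)
idx, inside = window_slices(z, t=1e-8, window_length=3.0, n_slices=30)
for s in range(30):
    particles_in_slice = np.flatnonzero(idx == s)   # update fields/particles here
```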


2007 ◽  
Vol 15 (2) ◽  
pp. 83-94 ◽  
Author(s):  
Joseph Wang ◽  
Yong Cao ◽  
Raed Kafafy ◽  
Viktor Decyk

A parallel, three-dimensional electrostatic PIC code is developed for large-scale electric propulsion simulations on parallel supercomputers. This code uses a newly developed immersed-finite-element particle-in-cell (IFE-PIC) algorithm designed to handle complex boundary conditions accurately while maintaining the computational speed of the standard PIC code. Domain decomposition is used in both the field solve and the particle push to divide the computation among processors. Two simulation studies are presented to demonstrate the capability of the code. The first is a full particle simulation of a near-thruster plume using the real ion-to-electron mass ratio. The second is a high-resolution simulation of multiple ion thruster plume interactions for a realistic spacecraft, using a domain enclosing the entire solar array panel. Performance benchmarks show that the IFE-PIC code achieves a high parallel efficiency of ≥ 90%.
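
A minimal sketch (assumed, not the IFE-PIC source) of the particle-exchange step that a domain-decomposed push needs: after the push, particles that have left the local subdomain are shipped to the neighboring rank. For illustration this assumes a 1D slab decomposition in z with periodic ends and uses mpi4py.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
z_lo, z_hi = rank * 1.0, (rank + 1) * 1.0           # local slab [z_lo, z_hi)

# Dummy post-push positions; a few have crossed the slab boundaries.
particles = np.random.default_rng(rank).uniform(z_lo - 0.1, z_hi + 0.1, 1000)

send_left = particles[particles < z_lo]
send_right = particles[particles >= z_hi]
particles = particles[(particles >= z_lo) & (particles < z_hi)]

left = (rank - 1) % size                             # periodic for simplicity
right = (rank + 1) % size
recv_from_right = comm.sendrecv(send_left, dest=left, source=right)
recv_from_left = comm.sendrecv(send_right, dest=right, source=left)
particles = np.concatenate([particles, recv_from_left, recv_from_right])
```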


1995 ◽  
Vol 283 ◽  
pp. 43-95 ◽  
Author(s):  
P. K. Yeung ◽  
James G. Brasseur ◽  
Qunzhen Wang

As discussed in a recent paper by Brasseur & Wei (1994), scale interactions in fully developed turbulence are of two basic types in the Fourier-spectral view. The cascade of energy from large to small scales is embedded within ‘local-to-non-local’ triadic interactions separated in scale by a decade or less. ‘Distant’ triadic interactions between widely disparate scales transfer negligible energy between the largest and smallest scales, but directly modify the structure of the smallest scales in relationship to the structure of the energy-dominated large scales. Whereas cascading interactions tend to isotropize the small scales as energy moves through spectral shells from low to high wavenumbers, distant interactions redistribute energy within spectral shells in a manner that leads to anisotropic redistributions of small-scale energy and phase in response to anisotropic structure in the large scales. To study the role of long-range interactions in small-scale dynamics, Yeung & Brasseur (1991) carried out a numerical experiment in which the marginally distant triads were purposely stimulated through a coherent narrow-band anisotropic forcing at the large scales readily interpretable in both the Fourier- and physical-space views. It was found that, after one eddy turnover time, the smallest scales rapidly became anisotropic as a direct consequence of the marginally distant triadic group in a manner consistent with the distant triadic equations. Because these asymptotic equations apply in the infinite Reynolds number limit, Yeung & Brasseur argued that the observed long-range effects should be applicable also at high Reynolds numbers. We continue the analysis of forced simulations in this study, focusing (i) on the detailed three-dimensional restructuring of the small scales as predicted by the asymptotic triadic equations, and (ii) on the relationship between Fourier- and physical-space evolution during forcing. We show that the three-dimensional restructuring of small-scale energy and vorticity in Fourier space from large-scale forcing is predicted in some detail by the distant triadic equations. We find that during forcing the distant interactions alter small-scale structure in two ways: energy is redistributed anisotropically within high-wavenumber spectral shells, and phase correlations are established at the small scales by the distant interactions. In the numerical experiments, the long-range interactions create two pairs of localized volumes of concentrated energy in three-dimensional Fourier space at high wavenumbers in which the Fourier modes are phase coupled. Each pair of locally phase-correlated volumes of Fourier modes separately corresponds to aligned vortex tubes in physical space in two orthogonal directions. We show that the dynamics of distant interactions in creating small-scale anisotropy may be described in physical space by differential advection and distortion of small-scale vorticity by the coherent large-scale energy-containing eddies, producing anisotropic alignment of small-scale vortex tubes. Scaling arguments indicate a disparity in timescale between distant triadic interactions and energy-cascading local-to-non-local interactions which increases with scale separation. Consequently, the small scales respond to forcing initially through the distant interactions.
However, as energy cascades from the large-scale to the small-scale Fourier modes, the stimulated distant interactions become embedded within a sea of local-to-non-local energy cascading interactions which reduce (but do not eliminate) small-scale anisotropy at later times. We find that whereas the small-scale structure is still anisotropic at these later times, the second-order velocity moment tensor is insensitive to this anisotropy. Third-order moments, on the other hand, do detect the anisotropy. We conclude that whereas a single statistical measure of anisotropy can be used to indicate the presence of anisotropy, a null result in that measure does not necessarily imply that the signal is isotropic. The results indicate that non-equilibrium non-stationary turbulence is particularly sensitive to long-range interactions and deviations from local isotropy.
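
A minimal illustrative sketch (our addition, not the authors' analysis) of the kind of single-point second-order anisotropy measure the abstract refers to: the normalized tensor b_ij = <u_i u_j>/<u_k u_k> − δ_ij/3 computed from velocity samples. As the abstract notes, such a measure can return a near-null result even when higher-order moments still detect anisotropy.

```python
import numpy as np

def anisotropy_tensor(u):
    """u: (3, N) array of velocity samples (e.g. a filtered small-scale field)."""
    m = u @ u.T / u.shape[1]                 # second-order moments <u_i u_j>
    return m / np.trace(m) - np.eye(3) / 3.0

rng = np.random.default_rng(3)
u_iso = rng.standard_normal((3, 100_000))           # roughly isotropic samples
u_aniso = u_iso * np.array([[2.0], [1.0], [1.0]])   # energy piled into x
print(np.round(anisotropy_tensor(u_iso), 3))        # near zero
print(np.round(anisotropy_tensor(u_aniso), 3))      # clearly non-zero
```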


Water ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 1480
Author(s):  
Xingwei Liu ◽  
Qiulan Zhang ◽  
Tangpei Cheng

To overcome the large time and memory consumption of large-scale, high-resolution contaminant transport simulations, an efficient approach was presented to parallelize the modular three-dimensional multi-species transport model (MT3DMS) (University of Alabama, Tuscaloosa, AL, USA) program on the J Adaptive Structured Meshes applications INfrastructure (JASMIN). In this approach, a domain decomposition method and a stencil-based method were used for the parallel implementation, while a ghost-cell strategy was used for communication. The MODFLOW-MT3DMS coupling mode was optimized to achieve the parallel coupling of flow and contaminant transport. Five types of models were used to verify the correctness and test the parallel performance of the method. The developed parallel program JMT3D (China University of Geosciences (Beijing), Beijing, China) increases the speed by up to 31.7 times and reduces memory consumption by 96% with 46 processors, while ensuring that the solution accuracy and convergence do not degrade as the number of subdomains increases. The BiCGSTAB (bi-conjugate gradient stabilized) method required the least time and achieved high speedup in most cases. Coupling the flow and contaminant transport further improved the efficiency of the simulations, with a 33.45 times speedup achieved on 46 processors. The AMG (algebraic multigrid) method achieved good scalability, with an efficiency above 100% on hundreds of processors for simulations of tens of millions of cells.
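
A minimal sketch (assumed for illustration, not JMT3D/JASMIN code) of a ghost-cell exchange for a 1D-decomposed concentration grid: each rank owns a block of cells plus one ghost layer on each side, and the ghosts are refilled from the neighboring ranks before the next transport step.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 100
c = np.zeros(n_local + 2)                 # [left ghost | owned cells | right ghost]
c[1:-1] = rank                            # dummy owned data

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Send my first owned cell left, receive my left ghost from the left neighbor.
comm.Sendrecv(c[1:2], dest=left, recvbuf=c[0:1], source=left)
# Send my last owned cell right, receive my right ghost from the right neighbor.
comm.Sendrecv(c[-2:-1], dest=right, recvbuf=c[-1:], source=right)
# At physical boundaries (PROC_NULL) the ghost is left untouched, so boundary
# conditions can be applied separately.
```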


Author(s):  
Seshu B. Nimmala ◽  
Solomon C. Yim ◽  
Stephan T. Grilli

This paper presents a parallel implementation and validation of an accurate and efficient three-dimensional computational model (3D numerical wave tank) based on fully nonlinear potential flow (FNPF) theory, and its extension to incorporate the motion of a laboratory snake piston wavemaker, as well as an absorbing beach, to simulate experiments in a large-scale 3D wave basin. This work is part of a long-term effort to develop a “virtual” computational wave basin to facilitate and complement large-scale physical wave-basin experiments. The code is based on a higher-order boundary-element method (BEM) combined with a fast multipole algorithm (FMA). Particular effort was devoted to making the code efficient for large-scale simulations on high-performance computing platforms. The numerical simulation capability can be tailored to serve as an optimization tool at the planning and detailed design stages of large-scale experiments at a specific basin by duplicating its exact physical and algorithmic features. To date, the waves that can be generated in the numerical wave tank (NWT) include solitary, cnoidal, and Airy waves. In this paper we detail the wave-basin model, mathematical formulation, and wave generation, and analyze the performance of the parallelized FNPF-BEM-FMA code as a function of numerical parameters. Experimental or analytical comparisons with NWT results are provided for several cases to assess the accuracy and applicability of the numerical model to practical engineering problems.
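
A minimal illustrative sketch (textbook approximation, not the FNPF-BEM-FMA model): the classical first-order solitary-wave free-surface profile of the kind a numerical wave tank can generate or be checked against. The paper's NWT generates such waves with a fully nonlinear wavemaker; this is only the leading-order formula.

```python
import numpy as np

G = 9.81                                   # gravitational acceleration [m/s^2]

def solitary_wave(x, t, H, h):
    """Free-surface elevation of a first-order solitary wave of height H in depth h."""
    c = np.sqrt(G * (h + H))               # first-order celerity
    k = np.sqrt(3.0 * H / (4.0 * h**3))    # effective wavenumber
    return H / np.cosh(k * (x - c * t))**2

x = np.linspace(-50.0, 50.0, 1001)
eta = solitary_wave(x, t=0.0, H=0.5, h=5.0)   # 0.5 m wave in 5 m depth
print(eta.max())                               # peaks at H = 0.5 m
```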


