Parallel Performance of Hierarchical Multipole Algorithms for Inductance Extraction

Author(s):  
Hemant Mahawar ◽  
Vivek Sarin ◽  
Ananth Grama
2018 ◽  
Author(s):  
Pavel Pokhilko ◽  
Evgeny Epifanovsky ◽  
Anna I. Krylov

Using single precision floating point representation reduces the size of data and computation time by a factor of two relative to double precision conventionally used in electronic structure programs. For large-scale calculations, such as those encountered in many-body theories, reduced memory footprint alleviates memory and input/output bottlenecks. Reduced size of data can lead to additional gains due to improved parallel performance on CPUs and various accelerators. However, using single precision can potentially reduce the accuracy of computed observables. Here we report an implementation of coupled-cluster and equation-of-motion coupled-cluster methods with single and double excitations in single precision. We consider both standard implementation and one using Cholesky decomposition or resolution-of-the-identity of electron-repulsion integrals. Numerical tests illustrate that when single precision is used in correlated calculations, the loss of accuracy is insignificant and pure single-precision implementation can be used for computing energies, analytic gradients, excited states, and molecular properties. In addition to pure single-precision calculations, our implementation allows one to follow a single-precision calculation by clean-up iterations, fully recovering double-precision results while retaining significant savings.


1989 ◽  
Author(s):  
Edward A. Carmona ◽  
Michael D. Rice
Keyword(s):  

Author(s):  
Yasuhito Takahashi ◽  
Koji Fujiwara ◽  
Takeshi Iwashita ◽  
Hiroshi Nakashima

Purpose This paper aims to propose a parallel-in-space-time finite-element method (FEM) for transient motor starting analyses. Although the domain decomposition method (DDM) is suitable for solving large-scale problems and the parallel-in-time (PinT) integration method such as Parareal and time domain parallel FEM (TDPFEM) is effective for problems with a large number of time steps, their parallel performances get saturated as the number of processes increases. To overcome the difficulty, the hybrid approach in which both the DDM and PinT integration methods are used is investigated in a highly parallel computing environment. Design/methodology/approach First, the parallel performances of the DDM, Parareal and TDPFEM were compared because the scalability of these methods in highly parallel computation has not been deeply discussed. Then, the combination of the DDM and Parareal was investigated as a parallel-in-space-time FEM. The effectiveness of the developed method was demonstrated in transient starting analyses of induction motors. Findings The combination of Parareal with the DDM can improve the parallel performance in the case where the parallel performance of the DDM, TDPFEM or Parareal is saturated in highly parallel computation. In the case where the number of unknowns is large and the number of available processes is limited, the use of DDM is the most effective from the standpoint of computational cost. Originality/value This paper newly develops the parallel-in-space-time FEM and demonstrates its effectiveness in nonlinear magnetoquasistatic field analyses of electric machines. This finding is significantly important because a new direction of parallel computing techniques and great potential for its further development are clarified.


1991 ◽  
Vol 03 (01) ◽  
pp. 1-29 ◽  
Author(s):  
S. VANDEWALLE ◽  
R. VAN DRIESSCHE ◽  
R. PIESSENS
Keyword(s):  

2021 ◽  
Author(s):  
Jiecheng Zhang ◽  
George Moridis ◽  
Thomas Blasingame

Abstract The Reservoir GeoMechanics Simulator (RGMS), a geomechanics simulator based on the finite element method and parallelized using the Message Passing Interface (MPI), is developed in this work to model the stresses and deformations in subsurface systems. RGMS can be used stand-alone, or coupled with flow and transport models. pT+H V1.5, a parallel MPI-based version of the serial T+H V1.5 code that describes mass and heat flow in hydrate-bearing porous media, is also developed. Using the fixed-stress split iterative scheme, RGMS is coupled with the pT+H V1.5 to investigate the geomechanical responses associated with gas production from hydrate accumulations. The code development and testing process involve evaluation of the parallelization and of the coupling method, as well as verification and validation of the results. The parallel performance of the codes is tested on the Ada Linux cluster of the Texas A&M High Performance Research Computing using up to 512 processors, and on a Mac Pro computer with 12 processors. The investigated problems are: Group 1: Geomechanical problems solved by RGMS in 2D Cartesian and cylindrical domains and a 3D problem, involving 4x106 and 3.375 x106 elements, respectively; Group 2: Realistic problems of gas production from hydrates using pT+H V1.5 in 2D and 3D systems with 2.45x105 and 3.6 x106 elements, respectively; Group 3: The 3D problem in Group 2 solved with the coupled RGMS-pT+H V1.5 simulator, fully accounting for geomechanics. Two domain partitioning options are investigated on the Ada Linux cluster and the Mac Pro, and the code parallel performance is monitored. On the Ada Linux cluster using 512 processors, the simulation speedups (a) of RGMS are 218.89, 188.13, and 284.70 in the Group 1 problems, (b) of pT+H V1.5 are 174.25 and 341.67 in the Group 2 cases, and (c) of the coupled simulators is 331.80 in Group 3. The results produced in this work show the necessity of using full geomechanics simulators in marine hydrate-related studies because of the associated pronounced geomechanical effects on production and displacements and (b) the effectiveness of the parallel simulators developed in this study, which can be the only realistic option in these complex simulations of large multi-dimensional domains.


Sign in / Sign up

Export Citation Format

Share Document