A comparison of MPI and co-array FORTRAN for large finite element variably saturated flow simulations

The purpose of this research is to determine how well co-array FORTRAN (CAF) performs relative to Message Passing Interface (MPI) on unstructured mesh finite element groundwater modelling applications with large problem sizes and core counts. This research used almost 150 million nodes and 300 million 3-D prism elements. Results for both the Cray XE6 and Cray XC30 are given. A comparison of the ghost-node update algorithms with source code provided for both MPI and CAF is also presented.
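
A ghost-node update can be illustrated with a minimal serial sketch. This is not the MPI or CAF source code the abstract refers to; the data layout, the `ghost_update` function, and the two-partition mesh are all hypothetical and only show what the update accomplishes: every partition refreshes its ghost copies from the partition that owns each node. In an MPI code the two phases would typically be matched send and receive calls, while in CAF they would typically be assignments to or from a remote co-array image.

```python
# Hypothetical layout: each partition owns some mesh nodes and keeps
# ghost copies of neighbouring nodes that are owned by another partition.

def ghost_update(partitions):
    """Refresh every ghost copy from the partition that owns the node.

    partitions: list of dicts with keys
      'owned'  -> {global_node_id: value}
      'ghosts' -> {global_node_id: value}   (stale until updated)
    Returns the number of values exchanged.
    """
    # Owner lookup: global node id -> index of the owning partition.
    owner = {}
    for rank, part in enumerate(partitions):
        for node in part["owned"]:
            owner[node] = rank

    # "Pack and send" phase: collect all values before writing anything,
    # so the update behaves like a message exchange.
    messages = []
    for rank, part in enumerate(partitions):
        for node in part["ghosts"]:
            src = owner[node]
            messages.append((rank, node, partitions[src]["owned"][node]))

    # "Receive and unpack" phase: overwrite the stale ghost values.
    for dest, node, value in messages:
        partitions[dest]["ghosts"][node] = value
    return len(messages)


# Six nodes in a row, split between two partitions with one ghost each.
parts = [
    {"owned": {0: 1.0, 1: 2.0, 2: 3.0}, "ghosts": {3: 0.0}},
    {"owned": {3: 4.0, 4: 5.0, 5: 6.0}, "ghosts": {2: 0.0}},
]
n_sent = ghost_update(parts)
print(n_sent, parts[0]["ghosts"], parts[1]["ghosts"])
# prints: 2 {3: 4.0} {2: 3.0}
```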

Based on Numerical Simulation of High-Performance Parallel Machine Muffler Experimental Calibration

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.718-720.1645 ◽

2013 ◽

Vol 718-720 ◽

pp. 1645-1650

Author(s):

Gen Yin Cheng ◽

Sheng Chen Yu ◽

Zhi Yong Wei ◽

Shao Jie Chen ◽

You Cheng

Keyword(s):

Numerical Simulation ◽

Finite Element ◽

Boundary Element ◽

Message Passing ◽

High Performance ◽

Message Passing Interface ◽

Parallel Machine ◽

Simulation Software ◽

Experimental Calibration ◽

The Cost

The commonly used commercial simulation packages SYSNOISE and ANSYS run on a single machine and cannot run directly on a parallel machine. When the finite element and boundary element methods are used to simulate muffler performance, the large amount of numerical computation means that an exact solution takes more than ten days, and sometimes as many as twenty. To reduce the cost of numerical simulation, a high-performance parallel machine was built from 32 commercial computers, and the finite element and boundary element simulation software was converted into a program that runs in an MPI (Message Passing Interface) parallel environment. The data from the simulation experiment show that the numerical simulation gives good results, and that the computing speed of the high-performance parallel machine is 25 to 30 times that of a single microcomputer.
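
The reported figures can be put in perspective with a small calculation. The sketch below assumes that the baseline microcomputer is comparable to one of the 32 machines in the cluster, which the abstract does not state; under that assumption, a speedup of 25 to 30 on 32 machines corresponds to a parallel efficiency of roughly 78% to 94%. The function name is illustrative.

```python
def parallel_efficiency(speedup, n_machines):
    """Parallel efficiency: achieved speedup divided by the ideal speedup."""
    return speedup / n_machines


N_MACHINES = 32  # cluster size given in the abstract

# Speedup range given in the abstract (25x to 30x a single microcomputer).
# Assumption: the baseline machine matches one cluster node.
for speedup in (25, 30):
    eff = parallel_efficiency(speedup, N_MACHINES)
    print(f"speedup {speedup}x on {N_MACHINES} machines: efficiency {eff:.4f}")
```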

A generic simulation cell method for developing extensible, efficient and readable parallel computational models

Geoscientific Model Development ◽

10.5194/gmd-8-473-2015 ◽

2015 ◽

Vol 8 (3) ◽

pp. 473-483

Author(s):

I. Honkonen

Keyword(s):

Message Passing ◽

Message Passing Interface ◽

Computational Models ◽

Source Code ◽

Computational Grid ◽

Communication Strategy ◽

Cell Method ◽

Parallel Performance ◽

Cell Class ◽

Simulation Cell

Abstract. I present a method for developing extensible and modular computational models without sacrificing serial or parallel performance or source code readability. By using a generic simulation cell method I show that it is possible to combine several distinct computational models to run in the same computational grid without requiring modification of existing code. This is an advantage for the development and testing of, e.g., geoscientific software as each submodel can be developed and tested independently and subsequently used without modification in a more complex coupled program. An implementation of the generic simulation cell method presented here, generic simulation cell class (gensimcell), also includes support for parallel programming by allowing model developers to select which simulation variables of, e.g., a domain-decomposed model to transfer between processes via a Message Passing Interface (MPI) library. This allows the communication strategy of a program to be formalized by explicitly stating which variables must be transferred between processes for the correct functionality of each submodel and the entire program. The generic simulation cell class requires a C++ compiler that supports a version of the language standardized in 2011 (C++11). The code is available at https://github.com/nasailja/gensimcell for everyone to use, study, modify and redistribute; those who do are kindly requested to acknowledge and cite this work.

Model Order Reduction of Large-Scale Finite Element Systems in an MPI Parallelized Environment for Usage in Multibody Simulation

Archive of Mechanical Engineering ◽

10.1515/meceng-2016-0027 ◽

2016 ◽

Vol 63 (4) ◽

pp. 475-494 ◽

Cited By ~ 1

Author(s):

Thomas Volzer ◽

Peter Eberhard

Keyword(s):

Finite Element ◽

Model Reduction ◽

Message Passing ◽

Large Scale ◽

Message Passing Interface ◽

Block Size ◽

Reduction Process ◽

Element Model ◽

Multibody Simulation ◽

Elastic Bodies

Abstract. The use of elastic bodies within multibody simulations has become increasingly important in recent years. To include elastic bodies, described as finite element models, in multibody simulations, the dimension of the system of ordinary differential equations must be reduced by projection. For this purpose, this work uses the modal reduction method, a method based on component mode synthesis (CMS), and a moment-matching method. Because the size of the non-reduced systems keeps growing, computing the projection matrix demands large computational resources and cannot be done on ordinary serial computers with the memory available. This paper presents the model reduction software Morembs++, which uses a parallelization concept based on the Message Passing Interface to meet the memory requirements and reduce the runtime of the model reduction process. Additionally, the behaviour of the Block-Krylov-Schur eigensolver, implemented in the Anasazi package of the Trilinos project, is analysed with regard to the choice of the size of the Krylov basis, the block size and the number of blocks. An iterative solver is also considered within the CMS-based method.
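
Reduction by projection can be written out as a short worked equation. The form below is a sketch that assumes an undamped, linear second-order finite element system and a Galerkin (one-sided) projection; the symbols $M$, $K$, $f$, $V$ and the reduced dimension $n$ are introduced here for illustration and are not taken from the paper. The reduction methods named in the abstract differ in how the columns of $V$ are chosen.

```latex
\begin{align}
  M\,\ddot{q}(t) + K\,q(t) &= f(t), \qquad q(t) \in \mathbb{R}^{N},\\
  q(t) &\approx V\,\bar{q}(t), \qquad V \in \mathbb{R}^{N \times n},\; n \ll N,\\
  \underbrace{V^{\mathsf T} M V}_{\bar{M}}\,\ddot{\bar{q}}(t)
  + \underbrace{V^{\mathsf T} K V}_{\bar{K}}\,\bar{q}(t)
  &= \underbrace{V^{\mathsf T} f(t)}_{\bar{f}(t)}.
\end{align}
```

The reduced matrices $\bar{M}$ and $\bar{K}$ are only $n \times n$, so the expensive step is computing $V$ from the full $N \times N$ system, which is the part the abstract describes as requiring parallel resources.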

Parallelization of a 3-Dimensional Hydrodynamics Model Using a Hybrid Method with MPI and OpenMP

Processes ◽

10.3390/pr9091548 ◽

2021 ◽

Vol 9 (9) ◽

pp. 1548

Author(s):

Jung Min Ahn ◽

Hongtae Kim ◽

Jae Gab Cho ◽

Taegu Kang ◽

Yong-seok Kim ◽

...

Keyword(s):

Water Pollution ◽

Decision Making ◽

Parallel Computation ◽

Message Passing ◽

Message Passing Interface ◽

Numerical Models ◽

Source Code ◽

Computation Time ◽

Environmental Research ◽

Quality Analysis

Process-based numerical models developed to perform hydraulic/hydrologic/water quality analysis of watersheds and rivers have become highly sophisticated, with a corresponding increase in their computation time. However, for incidents such as water pollution, rapid analysis and decision-making are critical. This paper proposes an optimized parallelization scheme to reduce the computation time of the Environmental Fluid Dynamics Code-National Institute of Environmental Research (EFDC-NIER) model, which has been continuously developed for water pollution or algal bloom prediction in rivers. An existing source code and a parallel computational code with open multi-processing (OpenMP) and a message passing interface (MPI) were optimized, and their computation times compared. Subsequently, the simulation results for the existing EFDC model and the model with the parallel computation code were compared. Furthermore, the optimal parallel combination for hybrid parallel computation was evaluated by comparing the simulation time based on the number of cores and threads. When code parallelization was applied, the performance improved by a factor of approximately five compared to the existing source code. Thus, if the parallel computational source code applied in this study is used, urgent decision-making will be easier for events such as water pollution incidents.
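
Evaluating the parallel combination amounts to timing the model under each split of the available cores into MPI processes and OpenMP threads. The sketch below is a hypothetical helper, not part of EFDC-NIER: it enumerates the candidate splits for a given core count and picks the fastest one from timings that the user has measured. It assumes that every core is used, so processes multiplied by threads equals the core count.

```python
def hybrid_layouts(total_cores):
    """List every (mpi_processes, omp_threads) pair that uses all cores."""
    return [
        (procs, total_cores // procs)
        for procs in range(1, total_cores + 1)
        if total_cores % procs == 0
    ]


def fastest_layout(timings):
    """Return the layout with the smallest measured wall-clock time.

    timings: {(mpi_processes, omp_threads): seconds}, measured by the user.
    """
    return min(timings, key=timings.get)


layouts = hybrid_layouts(8)
print(layouts)
# prints: [(1, 8), (2, 4), (4, 2), (8, 1)]
```

Each layout would be run once (or several times) on the target machine, and the measured times passed to `fastest_layout`; no timings are shown here because they depend entirely on the model and hardware.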

A Hybrid MPI-OpenMP Implementation of an Implicit Finite-Element Code on Parallel Architectures

The International Journal of High Performance Computing Applications ◽

10.1177/109434200201600402 ◽

2002 ◽

Vol 16 (4) ◽

pp. 371-393 ◽

Cited By ~ 31

Author(s):

G. Mahinthakumar ◽

F. Saied

Keyword(s):

Finite Element ◽

Hybrid Model ◽

Message Passing ◽

Message Passing Interface ◽

Hybrid Approach ◽

Parallel Architectures ◽

Finite Element Code ◽

Multiple Threads ◽

SMP Clusters ◽

Performance Results

Summary. The hybrid MPI-OpenMP model is a natural parallel programming paradigm for emerging parallel architectures that are based on symmetric multiprocessor (SMP) clusters. This paper presents a hybrid implementation adapted for an implicit finite-element code developed for groundwater transport simulations. The original code was parallelized for distributed memory architectures with MPI (Message Passing Interface), using a domain decomposition strategy. OpenMP directives were then added to the code (a straightforward loop-level implementation) to use multiple threads within each MPI process. To improve the OpenMP performance, several loop modifications were adopted. The parallel performance results are compared for four modern parallel architectures. The results show that for most of the cases tested, the pure MPI approach outperforms the hybrid model. The exceptions to this observation were mainly due to a limitation in the MPI library implementation on one of the architectures. A general conclusion is that while the hybrid model is a promising approach for SMP cluster architectures, at the time of this writing, the payoff may not justify converting all existing MPI codes to hybrid codes. However, improvements in OpenMP compilers combined with potential MPI limitations in SMP nodes may make the hybrid approach more attractive for a broader set of applications in the future.

Coupled Peridynamics Least Square Minimization with Finite Element Method in 3D and Implicit Solutions by Message Passing Interface

Journal of Peridynamics and Nonlocal Modeling ◽

10.1007/s42102-021-00060-3 ◽

2021 ◽

Author(s):

Qibang Liu ◽

X. J. Xin ◽

Jeff Ma

Keyword(s):

Finite Element Method ◽

Finite Element ◽

Message Passing ◽

Message Passing Interface ◽

Least Square ◽

Element Method

Parallelized Simulation of a Finite Element Method in Many Integrated Core Architecture

Journal of Engineering Materials and Technology ◽

10.1115/1.4035326 ◽

2017 ◽

Vol 139 (2) ◽

Cited By ~ 1

Author(s):

Moonho Tak ◽

Taehyo Park

Keyword(s):

Finite Element Method ◽

Finite Element ◽

Linear Algebra ◽

Message Passing ◽

Message Passing Interface ◽

Domain Decomposition Method ◽

Xeon Phi ◽

Parallel Libraries ◽

Many Integrated Core ◽

Element Method

We investigate a domain decomposition method (DDM) for the finite element method (FEM) on Intel's many integrated core (MIC) architecture in order to determine the most effective MIC usage. To this end, a recently proposed, highly scalable parallel DDM is first introduced with a detailed procedure. The Intel Xeon Phi MIC architecture is then presented to show how the parallel algorithm can be applied on a multicore architecture. An advantage of parallel simulation on the Xeon Phi MIC is that traditional parallel libraries such as the message passing interface (MPI) and open multiprocessing (OpenMP) can be used without any additional libraries. We demonstrate the DDM using popular linear algebra libraries such as the linear algebra package (LAPACK) and the basic linear algebra subprograms (BLAS). Moreover, both MPI and OpenMP are used for the parallel solution of the DDM. Finally, the parallel efficiencies are validated with a two-dimensional numerical example.

Solution of finite element problems using hybrid parallelization with MPI and OpenMP

Acta Universitaria ◽

10.15174/au.2012.391 ◽

2012 ◽

Vol 22 (7) ◽

pp. 14-24 ◽

Cited By ~ 3

Author(s):

José Miguel Vargas-Félix ◽

Salvador Botello-Rionda

Keyword(s):

Finite Element ◽

Message Passing ◽

Message Passing Interface ◽

Sparse Matrices ◽

Heat Diffusion ◽

System Of Equations ◽

Systems Of Equations ◽

Computer Clusters ◽

Solid Deformation ◽

Hybrid Parallelization

The Finite Element Method (FEM) is used to solve problems such as solid deformation and heat diffusion in domains with complex geometries. Such geometries require discretization with millions of elements, which is equivalent to solving systems of equations with sparse matrices and tens or hundreds of millions of variables. The aim is to use computer clusters to solve these systems. The solution method used is Schur substructuring, which divides a large system of equations into many small ones that can be solved more efficiently and in parallel. MPI (Message Passing Interface) is used to distribute the systems of equations so that each one is solved on a separate computer of the cluster. Each system of equations is solved with a solver that uses OpenMP for local parallelization.
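
The idea behind Schur substructuring can be sketched with the standard Schur complement of a block system. The notation below is an assumption made for illustration and does not come from the paper: unknowns are split into subdomain-interior values $x_I$ and interface values $x_B$, and $A_{II}$ is taken to be invertible.

```latex
\begin{align}
  \begin{pmatrix} A_{II} & A_{IB} \\ A_{BI} & A_{BB} \end{pmatrix}
  \begin{pmatrix} x_I \\ x_B \end{pmatrix}
  &=
  \begin{pmatrix} b_I \\ b_B \end{pmatrix},\\
  S &= A_{BB} - A_{BI}\,A_{II}^{-1}\,A_{IB},\\
  S\,x_B &= b_B - A_{BI}\,A_{II}^{-1}\,b_I,\\
  x_I &= A_{II}^{-1}\left(b_I - A_{IB}\,x_B\right).
\end{align}
```

When the interior unknowns are ordered subdomain by subdomain, $A_{II}$ is block diagonal with one block per subdomain, so each application of $A_{II}^{-1}$ splits into independent small solves. Those are the pieces that can be assigned to separate machines, with only the smaller interface system for $x_B$ coupling them.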
