A comparison of MPI and co-array FORTRAN for large finite element variably saturated flow simulations

2018 ◽  
Vol 19 (4) ◽  
pp. 423-432
Author(s):  
Fred Thomas Tracy ◽  
Thomas C. Oppe ◽  
Maureen K. Corcoran

The purpose of this research is to determine how well co-array FORTRAN (CAF) performs relative to Message Passing Interface (MPI) on unstructured mesh finite element groundwater modelling applications with large problem sizes and core counts. This research used almost 150 million nodes and 300 million 3-D prism elements. Results for both the Cray XE6 and Cray XC30 are given. A comparison of the ghost-node update algorithms with source code provided for both MPI and CAF is also presented.
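
As a rough illustration of what a ghost-node update does, the sketch below emulates one in serial Python on a small 1-D mesh. It is not the MPI or CAF source code the abstract refers to, and it calls no parallel library. The class `Subdomain`, the functions `build_1d_partition` and `ghost_update`, and the 8-node, 3-rank layout are all hypothetical names and values chosen for the example.

```python
# Serial emulation of a ghost-node update; no MPI or CAF library is used.
# All names (Subdomain, build_1d_partition, ghost_update) are hypothetical.

class Subdomain:
    def __init__(self, rank, owned_ids, ghost_ids):
        self.rank = rank
        self.values = {i: 0.0 for i in owned_ids}   # nodes this rank owns
        self.ghosts = {i: None for i in ghost_ids}  # copies of nodes owned elsewhere


def build_1d_partition(n_nodes, n_ranks):
    """Split nodes 0..n_nodes-1 into contiguous blocks with one ghost per side."""
    size = n_nodes // n_ranks
    subs = []
    for r in range(n_ranks):
        lo = r * size
        hi = n_nodes if r == n_ranks - 1 else lo + size
        # Ghosts are the neighbouring blocks' boundary nodes, if they exist.
        ghosts = [g for g in (lo - 1, hi) if 0 <= g < n_nodes]
        subs.append(Subdomain(r, range(lo, hi), ghosts))
    return subs


def ghost_update(subs):
    """Copy each owner's current value into every ghost copy of that node."""
    owner = {i: s for s in subs for i in s.values}
    for s in subs:
        for g in s.ghosts:
            # In MPI this copy would be a send/receive pair; in CAF, a remote
            # co-array assignment. Here it is a plain dictionary read.
            s.ghosts[g] = owner[g].values[g]


subs = build_1d_partition(n_nodes=8, n_ranks=3)
for s in subs:
    for i in s.values:
        s.values[i] = 10.0 * i  # stand-in for a freshly computed nodal value
ghost_update(subs)
```

After the call, every subdomain's `ghosts` dictionary holds the owner's latest value for each ghost node, which is the state a solver needs before its next local sweep.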

2013 ◽  
Vol 718-720 ◽  
pp. 1645-1650
Author(s):  
Gen Yin Cheng ◽  
Sheng Chen Yu ◽  
Zhi Yong Wei ◽  
Shao Jie Chen ◽  
You Cheng

The commonly used commercial simulation packages SYSNOISE and ANSYS run on a single machine and cannot run directly on a parallel machine. When the finite element and boundary element methods are used to simulate muffler performance, the large amount of numerical computation means that an exact solution can take more than ten days, and sometimes as many as twenty. To reduce the cost of numerical simulation, a high-performance parallel machine was built from 32 commercial computers, and the finite element and boundary element simulation software was converted into a program that runs in an MPI (Message Passing Interface) parallel environment. Data from the simulation experiments show that the numerical simulation gives good results and that the parallel machine computes 25 to 30 times faster than a single microcomputer.
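
The reported figures can be turned into a parallel efficiency. This is only a back-of-the-envelope reading: it assumes that the reference microcomputer is comparable to one of the 32 machines in the cluster, which the abstract does not state. With speedup $S$ on $p = 32$ machines:

```latex
E = \frac{S}{p}, \qquad
E_{\min} = \frac{25}{32} \approx 0.78, \qquad
E_{\max} = \frac{30}{32} \approx 0.94
```

Under that assumption, the cluster would be running at roughly 78% to 94% parallel efficiency.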


2015 ◽  
Vol 8 (3) ◽  
pp. 473-483
Author(s):  
I. Honkonen

Abstract. I present a method for developing extensible and modular computational models without sacrificing serial or parallel performance or source code readability. By using a generic simulation cell method I show that it is possible to combine several distinct computational models to run in the same computational grid without requiring modification of existing code. This is an advantage for the development and testing of, e.g., geoscientific software as each submodel can be developed and tested independently and subsequently used without modification in a more complex coupled program. An implementation of the generic simulation cell method presented here, generic simulation cell class (gensimcell), also includes support for parallel programming by allowing model developers to select which simulation variables of, e.g., a domain-decomposed model to transfer between processes via a Message Passing Interface (MPI) library. This allows the communication strategy of a program to be formalized by explicitly stating which variables must be transferred between processes for the correct functionality of each submodel and the entire program. The generic simulation cell class requires a C++ compiler that supports a version of the language standardized in 2011 (C++11). The code is available at https://github.com/nasailja/gensimcell for everyone to use, study, modify and redistribute; those who do are kindly requested to acknowledge and cite this work.


2016 ◽  
Vol 63 (4) ◽  
pp. 475-494 ◽  
Author(s):  
Thomas Volzer ◽  
Peter Eberhard

Abstract. The use of elastic bodies in multibody simulations has become increasingly important in recent years. To include elastic bodies, described as finite element models, in multibody simulations, the dimension of the system of ordinary differential equations must be reduced by projection. For this purpose, this work uses the modal reduction method, a method based on component mode synthesis (CMS), and a moment-matching method. Because the non-reduced systems keep growing in size, calculating the projection matrix demands large computational resources and cannot be done on ordinary serial computers with the memory available. This paper presents the model reduction software Morembs++, which uses a parallelization concept based on the Message Passing Interface to meet the memory requirements and reduce the runtime of the model reduction process. Additionally, the behaviour of the Block-Krylov-Schur eigensolver, implemented in the Anasazi package of the Trilinos project, is analysed with regard to the choice of the size of the Krylov basis, the block size and the number of blocks. An iterative solver is also considered within the CMS-based method.


Processes ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 1548
Author(s):  
Jung Min Ahn ◽  
Hongtae Kim ◽  
Jae Gab Cho ◽  
Taegu Kang ◽  
Yong-seok Kim ◽  
...  

Process-based numerical models developed to perform hydraulic/hydrologic/water quality analysis of watersheds and rivers have become highly sophisticated, with a corresponding increase in their computation time. However, for incidents such as water pollution, rapid analysis and decision-making are critical. This paper proposes an optimized parallelization scheme to reduce the computation time of the Environmental Fluid Dynamics Code-National Institute of Environmental Research (EFDC-NIER) model, which has been continuously developed for water pollution or algal bloom prediction in rivers. An existing source code and a parallel computational code with open multi-processing (OpenMP) and a message passing interface (MPI) were optimized, and their computation times compared. Subsequently, the simulation results for the existing EFDC model and the model with the parallel computation code were compared. Furthermore, the optimal parallel combination for hybrid parallel computation was evaluated by comparing the simulation time based on the number of cores and threads. When code parallelization was applied, the performance improved by a factor of approximately five compared to the existing source code. Thus, if the parallel computational source code applied in this study is used, urgent decision-making will be easier for events such as water pollution incidents.


Author(s):  
G. Mahinthakumar ◽  
F. Saied

Summary The hybrid MPI-OpenMP model is a natural parallel programming paradigm for emerging parallel architectures that are based on symmetric multiprocessor (SMP) clusters. This paper presents a hybrid implementation adapted for an implicit finite-element code developed for groundwater transport simulations. The original code was parallelized for distributed memory architectures using MPI (Message Passing Interface) using a domain decomposition strategy. OpenMP directives were then added to the code (a straightforward loop-level implementation) to use multiple threads within each MPI process. To improve the OpenMP performance, several loop modifications were adopted. The parallel performance results are compared for four modern parallel architectures. The results show that for most of the cases tested, the pure MPI approach outperforms the hybrid model. The exceptions to this observation were mainly due to a limitation in the MPI library implementation on one of the architectures. A general conclusion is that while the hybrid model is a promising approach for SMP cluster architectures, at the time of this writing, the payoff may not be justified for converting all existing MPI codes to hybrid codes. However, improvements in OpenMP compilers combined with potential MPI limitations in SMP nodes may make the hybrid approach more attractive for a broader set of applications in the future.


Author(s):  
Moonho Tak ◽  
Taehyo Park

We investigate a domain decomposition method (DDM) for the finite element method (FEM) on Intel's many integrated core (MIC) architecture in order to determine the most effective MIC usage. To this end, a recently proposed, highly scalable parallel DDM is first described in detail. Then Intel's Xeon Phi MIC architecture is presented to show how the parallel algorithm can be applied to a multicore architecture. Parallel simulation on the Xeon Phi MIC has the advantage that traditional parallel libraries such as the message passing interface (MPI) and open multiprocessing (OpenMP) can be used without any additional libraries. We demonstrate the DDM using popular linear algebra libraries such as the linear algebra package (LAPACK) and the basic linear algebra subprograms (BLAS). Both MPI and OpenMP are used to parallelize the solution of the DDM. Finally, the parallel efficiencies are validated on a two-dimensional numerical example.


2012 ◽  
Vol 22 (7) ◽  
pp. 14-24 ◽  
Author(s):  
José Miguel Vargas-Félix ◽  
Salvador Botello-Rionda

The Finite Element Method (FEM) is used to solve problems such as solid deformation and heat diffusion in domains with complex geometries. Such geometries require discretization with millions of elements, which is equivalent to solving systems of equations with sparse matrices and tens or hundreds of millions of variables. The aim is to use computer clusters to solve these systems. The solution method used is Schur substructuring, which divides a large system of equations into many small ones that can be solved more efficiently. This method allows parallelization. MPI (Message Passing Interface) is used to distribute the systems of equations so that each one is solved on a separate computer of a cluster. Each system of equations is solved with a solver that uses OpenMP for local parallelization.
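
The idea behind Schur substructuring can be shown on a toy problem. The sketch below is a serial, pure-Python illustration and not the authors' implementation. It assumes a symmetric system with a single interface unknown; the function names `solve` and `schur_solve` and the 5-node tridiagonal test system are hypothetical. Each `solve(a_ii, ...)` call touches only one subdomain's interior block, so in a cluster setting these are the solves that could be handed to different machines.

```python
def solve(a, b):
    """Solve the dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def schur_solve(subdomains, a_bb, f_b):
    """Solve via the Schur complement for ONE interface unknown.

    Each subdomain is (A_II, A_IB, f_I); symmetry (A_BI = A_IB^T) is assumed.
    """
    s, g = a_bb, f_b
    for a_ii, a_ib, f_i in subdomains:      # independent per-subdomain solves
        w = solve(a_ii, a_ib)               # A_II^{-1} A_IB
        z = solve(a_ii, f_i)                # A_II^{-1} f_I
        s -= sum(c * wi for c, wi in zip(a_ib, w))
        g -= sum(c * zi for c, zi in zip(a_ib, z))
    x_b = g / s                             # small interface system
    interiors = []
    for a_ii, a_ib, f_i in subdomains:      # back-substitute into each interior
        rhs = [fi - c * x_b for fi, c in zip(f_i, a_ib)]
        interiors.append(solve(a_ii, rhs))
    return x_b, interiors


# 5-node chain, matrix tridiag(-1, 2, -1); node 2 is the interface.
block = [[2.0, -1.0], [-1.0, 2.0]]
left = (block, [0.0, -1.0], [0.0, 0.0])     # interior nodes 0, 1
right = (block, [-1.0, 0.0], [0.0, 6.0])    # interior nodes 3, 4
x_b, interiors = schur_solve([left, right], a_bb=2.0, f_b=0.0)
```

The right-hand side was chosen so that the exact solution of the full 5-by-5 system is 1, 2, 3, 4, 5; the interface value and the two interior vectors should reproduce it up to rounding.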


2015 ◽  
Vol 16 (2) ◽  
Author(s):  
Fred T. Tracy ◽  
Thomas C. Oppe ◽  
William A. Ward ◽  
Maureen K. Corcoran
