Hybrid Parallelization Techniques

Author(s): Takahiro Katagiri
2015 · Vol. 18 (4) · pp. 1363-1377
Author(s): Davi Teodoro Fernandes, Liang-Yee Cheng, Eric Henrique Favero, Kazuo Nishimoto

2009 · Vol. 58 (5) · pp. 1071-1080
Author(s): Vincent Heuveline, Mathias J. Krause, Jonas Latt

Author(s): Dorian Krause, Mark Potse, Thomas Dickopf, Rolf Krause, Angelo Auricchio, ...

2020 · Vol. 497 (1) · pp. 536-555
Author(s): Long Wang, Masaki Iwasawa, Keigo Nitadori, Junichiro Makino

ABSTRACT The numerical simulation of massive collisional stellar systems, such as globular clusters (GCs), is very time consuming. Until now, only a few realistic million-body simulations of GCs with a small fraction of binaries (5 per cent) have been performed, using the nbody6++gpu code. Such models took half a year of computing time on a Graphics Processing Unit (GPU)-based supercomputer. In this work, we develop a new N-body code, petar, by combining the Barnes–Hut tree, Hermite integrator, and slow-down algorithmic regularization methods. The code can accurately handle an arbitrary fraction of multiple systems (e.g. binaries and triples) while maintaining high performance through hybrid parallelization with mpi, openmp, simd instructions, and GPU. Benchmarks indicate that petar and nbody6++gpu agree very well on the long-term evolution of the global structure, binary orbits, and escapers. On a high-end GPU desktop computer, a million-body simulation with all stars in binaries runs 11 times faster with petar than with nbody6++gpu. Moreover, petar scales well on the Cray XC50 supercomputer as the number of cores increases. This makes the 10-million-body problem, which covers the regime of ultracompact dwarfs and nuclear star clusters, tractable.
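The short-range/long-range split at the heart of such tree-plus-direct hybrids can be illustrated with a toy calculation. The sketch below is not petar's algorithm (petar combines a Barnes–Hut tree with Hermite integration and slow-down regularization); it only shows the basic idea of summing close pairs directly while approximating distant particles by their centre of mass. All coordinates, masses, and the cutoff radius are invented:

```python
def accel_direct(pos, mass, i):
    """Exact acceleration on particle i from all others (softening-free toy, G = 1)."""
    ax = ay = 0.0
    for j, (m, (xj, yj)) in enumerate(zip(mass, pos)):
        if j == i:
            continue
        dx, dy = xj - pos[i][0], yj - pos[i][1]
        r3 = (dx * dx + dy * dy) ** 1.5
        ax += m * dx / r3
        ay += m * dy / r3
    return ax, ay

def accel_hybrid(pos, mass, i, r_cut):
    """Close neighbours summed directly; everything beyond r_cut lumped into
    a single centre-of-mass pseudo-particle (the crudest possible tree node)."""
    ax = ay = 0.0
    far_m = far_mx = far_my = 0.0
    for j, (m, (xj, yj)) in enumerate(zip(mass, pos)):
        if j == i:
            continue
        dx, dy = xj - pos[i][0], yj - pos[i][1]
        r2 = dx * dx + dy * dy
        if r2 < r_cut * r_cut:            # close pair: direct sum
            r3 = r2 ** 1.5
            ax += m * dx / r3
            ay += m * dy / r3
        else:                             # distant: accumulate the monopole
            far_m += m
            far_mx += m * xj
            far_my += m * yj
    if far_m > 0.0:                       # add the centre-of-mass contribution
        cx, cy = far_mx / far_m, far_my / far_m
        dx, dy = cx - pos[i][0], cy - pos[i][1]
        r3 = (dx * dx + dy * dy) ** 1.5
        ax += far_m * dx / r3
        ay += far_m * dy / r3
    return ax, ay
```

For a tight binary perturbed by a distant clump, the hybrid sum reproduces the direct sum very closely, because the dominant close-pair term is still computed exactly and only the small far-field term is approximated.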


Author(s): Carmen Cotelo Queijo, Anders Gomez Tato, Ignacio Lopez Cabido, Jose Manuel Cotos Yanez

2019 · Vol. 2019 · pp. 1-10
Author(s): Kenji Ono, Takanori Uchida

It is important to develop a reliable, high-throughput simulation method for predicting airflows during the installation planning phase of wind power plants. This study proposes a two-stage mesh generation approach to reduce meshing cost and introduces a hybrid parallelization scheme for atmospheric fluid simulations. The meshing approach splits mesh generation into two stages: in the first stage, the meshing parameters that uniquely determine the mesh distribution are extracted; in the second stage, a mesh system is generated in parallel via an in situ approach, using the parameters obtained in the initialization phase of the simulation. The proposed two-stage approach is flexible, since an arbitrary number of processes can be selected at run time. An efficient OpenMP-MPI hybrid parallelization scheme is also developed, using middleware that provides a framework for parallel codes based on the domain decomposition method. Preliminary results on meshing and computing performance show excellent scalability in a strong scaling test.
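The two-stage idea, extracting a small parameter set once and then letting each process build its own subdomain mesh in situ, can be sketched in a few lines. This is a hypothetical 1-D uniform-grid illustration, not the authors' mesh generator; the function names and the block-decomposition rule are invented:

```python
def extract_mesh_params(xmin, xmax, n_cells):
    """Stage 1: derive the few parameters that uniquely determine the mesh."""
    return {"xmin": xmin, "dx": (xmax - xmin) / n_cells, "n_cells": n_cells}

def generate_local_mesh(params, rank, n_procs):
    """Stage 2: each process builds only its own subdomain's cell centres,
    in situ, from the shared parameters (1-D block decomposition)."""
    n, dx, x0 = params["n_cells"], params["dx"], params["xmin"]
    # Block decomposition: spread the remainder cells over the first ranks.
    base, rem = divmod(n, n_procs)
    lo = rank * base + min(rank, rem)
    hi = lo + base + (1 if rank < rem else 0)
    return [x0 + (i + 0.5) * dx for i in range(lo, hi)]
```

Because stage 1 produces only a handful of scalars, the number of processes in stage 2 can be chosen freely at run time, and concatenating the local meshes in rank order reproduces the global mesh exactly.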


2000 · Vol. 8 (1) · pp. 5-12
Author(s): John Michalakes

Beginning with the March 1998 release of the Penn State University/NCAR Mesoscale Model (MM5), and continuing through eight subsequent releases up to the present, the official version has run on distributed-memory (DM) parallel computers. Source translation and runtime library support minimize the impact of parallelization on the original model source code, with the result that the majority of the code is line-for-line identical with the original version. Parallel performance and scaling are equivalent to earlier, hand-parallelized versions, and the modifications have no effect when the code is compiled and run without the DM option. Supported computers include the IBM SP, Cray T3E, Fujitsu VPP, Compaq Alpha clusters, and clusters of PCs (so-called Beowulf clusters). The approach is also compatible with shared-memory parallel directives, allowing distributed-memory/shared-memory hybrid parallelization on distributed-memory clusters of symmetric multiprocessors.
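The two-level decomposition described above, distributed-memory patches each further divided into shared-memory tiles, can be sketched as follows. This is a hypothetical Python illustration (MM5 itself is Fortran, parallelized with message passing and shared-memory directives): patches stand in for DM processes, a thread pool stands in for shared-memory threads, and the toy stencil is row-local so no halo exchange is needed:

```python
from concurrent.futures import ThreadPoolExecutor

def smooth_row(row):
    """Toy per-row stencil: 3-point average with clamped ends."""
    n = len(row)
    return [(row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]

def process_patch(patch, n_threads=2):
    """Shared-memory level: threads work on tiles (here, rows) within one
    patch, standing in for compiler directives inside a single DM process."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(smooth_row, patch))

def run_hybrid(grid, n_patches=2, n_threads=2):
    """Distributed-memory level: split the grid row-wise into patches (each
    patch would live on its own process), then recombine the results."""
    size = (len(grid) + n_patches - 1) // n_patches
    patches = [grid[i:i + size] for i in range(0, len(grid), size)]
    out = []
    for p in patches:                 # executed serially here; ranks in real code
        out.extend(process_patch(p, n_threads))
    return out
```

A real model would also exchange halo rows between neighbouring patches before each step; here the stencil touches only one row, so the patch boundaries need no communication and the hybrid result matches a serial sweep exactly.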

