A parallel computing approach to viewshed analysis of large terrain data using graphics processing units

A CUDA implementation, running on one and on several graphics processing units (GPUs), is presented for the forward modeling of gravitational fields from a three-dimensional volumetric ensemble composed of unitary prisms of constant density. We compare the performance obtained on the GPUs against a previous version coded in OpenMP with MPI, and we analyze the results on both platforms. The use of GPUs represents a breakthrough in parallel computing and has led to applications in many fields. Nevertheless, in some applications the decomposition of the tasks is not trivial, as this paper shows. Instead of a trivial decomposition of the domain, we decompose the problem by sets of prisms and use a different memory space per CUDA processing core, avoiding the performance decay caused by the constant calls to kernel functions that a parallelization by observation points would require. The design and implementation are the main contributions of this work, because the parallelization scheme is not trivial. The performance results obtained are comparable to those of a small processing cluster.
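The decomposition by sets of prisms described in this abstract can be illustrated with a minimal serial sketch in Python. It is not the authors' CUDA code. Each hypothetical worker owns a contiguous set of prisms and a private accumulator, and the partial fields are summed at the end. As a loudly stated simplification, every prism is treated as a point mass at its centre instead of using the exact prism formula, and z is taken positive downward. All function names and the tuple layout are assumptions made for this illustration.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def gz_point_mass(obs, prism):
    """Vertical attraction at obs from one prism.

    ASSUMPTION: the prism is approximated by a point mass at its centre;
    a real forward model would use the closed-form prism expression.
    prism = (cx, cy, cz, volume, density), z positive downward.
    """
    ox, oy, oz = obs
    cx, cy, cz, volume, density = prism
    dx, dy, dz = cx - ox, cy - oy, cz - oz
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    return G * density * volume * dz / r ** 3


def forward_by_prism_sets(observations, prisms, n_workers):
    """Split the prisms (not the observation points) among workers."""
    chunk = math.ceil(len(prisms) / n_workers)
    # One private accumulator per worker, so workers never write to
    # the same memory while they process their own set of prisms.
    partial = [[0.0] * len(observations) for _ in range(n_workers)]
    for w in range(n_workers):
        for prism in prisms[w * chunk:(w + 1) * chunk]:
            for i, obs in enumerate(observations):
                partial[w][i] += gz_point_mass(obs, prism)
    # Final reduction: add the per-worker partial fields together.
    return [sum(partial[w][i] for w in range(n_workers))
            for i in range(len(observations))]
```

In this sketch the outer loop over workers runs serially; on a GPU each set of prisms would be processed concurrently, and only the final reduction would combine the private accumulators.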


A distributed computing approach to improve the performance of the Parallel Ocean Program (v2.1)

Geoscientific Model Development ◽

10.5194/gmd-7-267-2014 ◽

2014 ◽

Vol 7 (1) ◽

pp. 267-281 ◽

Cited By ~ 12

Author(s):

B. van Werkhoven ◽

J. Maassen ◽

M. Kliphuis ◽

H. A. Dijkstra ◽

S. E. Brunnabend ◽

...

Keyword(s):

Distributed Computing ◽

Load Balancing ◽

Ocean Circulation ◽

Graphics Processing Units ◽

Block Partitioning ◽

Model Code ◽

Graphics Processing ◽

Computing Approach ◽

Parallel Ocean Program

Abstract. The Parallel Ocean Program (POP) is used in many strongly eddying ocean circulation simulations. Ideally, one would run thousand-year-long simulations, but the current performance of POP prohibits simulations of this length. In this work, using a new distributed computing approach, two methods to improve the performance of POP are presented. The first is a block-partitioning scheme that optimizes the load balancing of POP so that it can be run efficiently in a multi-platform setting. The second is the implementation of part of the POP model code on graphics processing units (GPUs). We show that the combination of both innovations also leads to a substantial performance increase when running POP simultaneously over multiple computational platforms.
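The idea of block partitioning for load balancing can be sketched as follows. This is a hypothetical illustration, not the scheme from the paper: it cuts a land/ocean mask into square blocks, drops blocks that contain no ocean points, and assigns the rest greedily (heaviest block first) to the worker that would finish earliest given an assumed relative speed. It ignores communication between neighbouring blocks, which a real multi-platform scheme would have to account for. The function names, the mask format and the `speeds` parameter are all assumptions.

```python
def make_blocks(ocean_mask, block_size):
    """Cut a 2-D mask (1 = ocean, 0 = land) into square blocks.

    Returns (row0, col0, n_ocean) for every block with at least one
    ocean point; land-only blocks need no computation and are dropped.
    """
    rows, cols = len(ocean_mask), len(ocean_mask[0])
    blocks = []
    for r0 in range(0, rows, block_size):
        for c0 in range(0, cols, block_size):
            n_ocean = sum(
                ocean_mask[r][c]
                for r in range(r0, min(r0 + block_size, rows))
                for c in range(c0, min(c0 + block_size, cols))
            )
            if n_ocean > 0:
                blocks.append((r0, c0, n_ocean))
    return blocks


def assign_blocks(blocks, speeds):
    """Greedy assignment: heaviest block first, to the worker that
    would have the smallest estimated finish time after taking it.

    speeds[i] is an ASSUMED relative throughput of worker i.
    Returns (assignment, estimated_time_per_worker).
    """
    loads = [0.0] * len(speeds)
    assignment = [[] for _ in speeds]
    for block in sorted(blocks, key=lambda b: b[2], reverse=True):
        w = min(range(len(speeds)),
                key=lambda i: (loads[i] + block[2]) / speeds[i])
        assignment[w].append(block)
        loads[w] += block[2]
    times = [loads[i] / speeds[i] for i in range(len(speeds))]
    return assignment, times
```

Weighting blocks by their ocean-point count rather than by area is the design choice that matters here: two blocks of equal size can carry very different amounts of work when one of them is mostly land.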


A distributed computing approach to improve the performance of the Parallel Ocean Program (v2.1)

Geoscientific Model Development Discussions ◽

10.5194/gmdd-6-4705-2013 ◽

2013 ◽

Vol 6 (3) ◽

pp. 4705-4744 ◽

Cited By ~ 1

Author(s):

B. van Werkhoven ◽

J. Maassen ◽

M. Kliphuis ◽

H. A. Dijkstra ◽

S. E. Brunnabend ◽

...

Keyword(s):

Distributed Computing ◽

Load Balancing ◽

Ocean Circulation ◽

Graphics Processing Units ◽

Block Partitioning ◽

Model Code ◽

Graphics Processing ◽

Computing Approach ◽

Parallel Ocean Program

Abstract. The Parallel Ocean Program (POP) is used in many strongly eddying ocean circulation simulations. Ideally, one would like to run thousand-year-long simulations, but the current performance of POP prohibits simulations of this type. In this work, using a new distributed computing approach, two innovations to improve the performance of POP are presented. The first is a new block-partitioning scheme that optimizes the load balancing of POP so that it can be run efficiently in a multi-platform setting. The second is an implementation of part of the POP model code on graphics processing units (GPUs). We show that the combination of both innovations also leads to a substantial performance increase when running POP simultaneously over multiple computational platforms.


Parallel computing for simultaneous iterative tomographic imaging by graphics processing units

10.1117/12.2223466 ◽

2016 ◽

Author(s):

Pedro D. Bello-Maldonado ◽

Ricardo López ◽

Colleen Rogers ◽

Yuanwei Jin ◽

Enyue Lu

Keyword(s):

Parallel Computing ◽

Graphics Processing Units ◽

Tomographic Imaging ◽

Graphics Processing


Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration

Journal of Biomedical Optics ◽

10.1117/1.3041496 ◽

2008 ◽

Vol 13 (6) ◽

pp. 060504 ◽

Cited By ~ 232

Author(s):

Erik Alerstam ◽

Tomas Svensson ◽

Stefan Andersson-Engels

Keyword(s):

Monte Carlo Simulation ◽

Monte Carlo ◽

Parallel Computing ◽

Graphics Processing Units ◽

High Speed ◽

Photon Migration ◽

Graphics Processing


Time-Domain Power Quality State Estimation Based on Kalman Filter Using Parallel Computing on Graphics Processing Units

IEEE Access ◽

10.1109/access.2018.2823721 ◽

2018 ◽

Vol 6 ◽

pp. 21152-21163 ◽

Cited By ~ 7

Author(s):

Rafael Cisneros-Magana ◽

Aurelio Medina ◽

Venkata Dinavahi ◽

Antonio Ramos-Paz

Keyword(s):

Parallel Computing ◽

Kalman Filter ◽

State Estimation ◽

Power Quality ◽

Time Domain ◽

Graphics Processing Units ◽

Graphics Processing


A high performance approach for parallel computing of fibre Bragg grating strain profiles using graphics processing units

International Journal of High Performance Systems Architecture ◽

10.1504/ijhpsa.2016.081743 ◽

2016 ◽

Vol 6 (4) ◽

pp. 197

Author(s):

L.H. Negri ◽

H.S. Lopes ◽

M. Muller ◽

J.L. Fabris ◽

A.S. Paterno

Keyword(s):

Parallel Computing ◽

Graphics Processing Units ◽

High Performance ◽

Bragg Grating ◽

Fibre Bragg Grating ◽

Graphics Processing


Use of GPU Computing for Uncertainty Quantification in Computational Mechanics: A Case Study

Scientific Programming ◽

10.1155/2011/730213 ◽

2011 ◽

Vol 19 (4) ◽

pp. 199-212 ◽

Cited By ~ 3

Author(s):

Gaurav ◽

Steven F. Wojtkiewicz

Keyword(s):

Parallel Computing ◽

Uncertainty Quantification ◽

Graphics Processing Units ◽

Computational Mechanics ◽

Gpu Computing ◽

Single Instruction Multiple Data ◽

Performance Constraints ◽

Multiple Data ◽

Graphics Processing

Graphics processing units (GPUs) are rapidly emerging as a more economical and highly competitive alternative to CPU-based parallel computing. As the degree of software control of GPUs has increased, many researchers have explored their use in non-gaming applications. Recent studies have shown that GPUs consistently outperform their best corresponding CPU-based parallel computing alternatives in single-instruction multiple-data (SIMD) strategies. This study explores the use of GPUs for uncertainty quantification in computational mechanics. Five types of analysis procedures that are frequently utilized for uncertainty quantification of mechanical and dynamical systems have been considered and their GPU implementations have been developed. The numerical examples presented in this study show that considerable gains in computational efficiency can be obtained for these procedures. It is expected that the GPU implementations presented in this study will serve as initial bases for further developments in the use of GPUs in the field of uncertainty quantification and will (i) aid the understanding of the performance constraints on the relevant GPU kernels and (ii) provide some guidance regarding the computational and the data structures to be utilized in these novel GPU implementations.


Fluid-film lubrication computing with many-core processors and graphics processing units

Advances in Mechanical Engineering ◽

10.1177/1687814018804719 ◽

2018 ◽

Vol 10 (10) ◽

pp. 168781401880471

Author(s):

Nenzi Wang ◽

Hsin-Yi Chen ◽

Yu-Wen Chen

Keyword(s):

Parallel Computing ◽

Graphics Processing Units ◽

Fluid Film ◽

The Many ◽

Many Core ◽

Graphics Processing ◽

Processor Cores ◽

Many Integrated Core ◽

Film Lubrication ◽

Fluid Film Lubrication

Modern processors with many cores and large caches may offer little computational advantage if only serial computing is employed. In this study, several parallel computing approaches, using devices with multiple or many processor cores and graphics processing units, are applied and compared to illustrate their potential applications in the study of fluid-film lubrication. Two Reynolds equations and an air bearing optimum design are solved using three parallel computing paradigms, OpenMP, Compute Unified Device Architecture, and OpenACC, on standalone shared-memory computers. The newly developed many-integrated-core processors are also programmed with OpenMP to release their computing potential. The results show that OpenACC computing can perform better than OpenMP computing for the discretized Reynolds equation with a large gridwork. This is mainly due to the larger cache available in the tested graphics processing units. The bearing design benefits most when a system with a many-integrated-core processor is used, because such a system can perform computation at the optimization-algorithm level and use its many processor cores effectively. A proper combination of parallel computing devices and programming models can complement efficient numerical methods or optimization algorithms to accelerate many tribological simulations or engineering designs.


Parallel Power Flow Computation Trends and Applications: A Review Focusing on GPU

Energies ◽

10.3390/en13092147 ◽

2020 ◽

Vol 13 (9) ◽

pp. 2147

Author(s):

Dong-Hee Yoon ◽

Youngsun Han

Keyword(s):

Parallel Computing ◽

Power Systems ◽

Power System ◽

Graphics Processing Units ◽

Power Flow ◽

Computation Time ◽

Computing Methods ◽

Flow Computation ◽

Fast Power ◽

Graphics Processing

A power flow study aims to analyze a power system by obtaining the voltage magnitude and phase angle of the buses inside the power system. Power flow computation uses a numerical method to solve a nonlinear system, which takes time because many iterations may be needed to reach the final solution. In addition, as the size and complexity of power systems increase, more computational power is required for power system studies. Therefore, there have been many attempts to conduct power flow computation on large amounts of data using parallel computing to reduce the computation time. Furthermore, with recent system developments, attempts have been made to increase the speed of parallel computing using graphics processing units (GPUs). In this review paper, we summarize issues related to parallel processing in power flow studies and analyze research into the performance of fast power flow computations using parallel computing methods on GPUs.
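To make the iterative nature of power flow computation concrete, here is a minimal sketch of a Newton-Raphson solve for an assumed two-bus system: a slack bus fixed at 1.0 per unit and zero angle, connected through one line to a load bus. The system, the per-unit values and the function names are illustrative assumptions, not taken from the review. For brevity the Jacobian is approximated by finite differences rather than derived analytically.

```python
import cmath


def injected_power(theta, vmag, y, v_slack=1.0 + 0j):
    """Complex power injected at the load bus for a given voltage."""
    v2 = cmath.rect(vmag, theta)
    i2 = y * (v2 - v_slack)  # current injected into the network at bus 2
    return v2 * i2.conjugate()


def solve_two_bus(p_load, q_load, z_line, tol=1e-10, max_iter=20):
    """Newton-Raphson for the angle and magnitude of the load-bus voltage.

    Returns (theta, vmag, iterations). All quantities are per unit.
    """
    y = 1 / z_line
    s_spec = -complex(p_load, q_load)  # a load is a negative injection

    def mismatch(t, v):
        s = injected_power(t, v, y) - s_spec
        return s.real, s.imag

    theta, vmag = 0.0, 1.0  # flat start
    h = 1e-7                # finite-difference step (an assumption)
    for iteration in range(max_iter):
        fp, fq = mismatch(theta, vmag)
        if max(abs(fp), abs(fq)) < tol:
            return theta, vmag, iteration
        # Forward-difference approximation of the 2x2 Jacobian.
        fp_t, fq_t = mismatch(theta + h, vmag)
        fp_v, fq_v = mismatch(theta, vmag + h)
        j11, j21 = (fp_t - fp) / h, (fq_t - fq) / h
        j12, j22 = (fp_v - fp) / h, (fq_v - fq) / h
        det = j11 * j22 - j12 * j21
        # Solve J * [dtheta, dv] = -[fp, fq] with Cramer's rule.
        dtheta = (-fp * j22 + fq * j12) / det
        dv = (-fq * j11 + fp * j21) / det
        theta += dtheta
        vmag += dv
    raise RuntimeError("power flow did not converge")
```

A call such as `solve_two_bus(0.5, 0.2, 0.01 + 0.1j)` returns the load-bus angle, magnitude and iteration count. In a large network the same loop applies, but each iteration requires solving a large sparse linear system, which is where most of the computation time goes.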
