A parallel computing approach to viewshed analysis of large terrain data using graphics processing units

2013 ◽  
Vol 27 (2) ◽  
pp. 363-384 ◽  
Author(s):  
Yanli Zhao ◽  
Anand Padmanabhan ◽  
Shaowen Wang

2013 ◽  
Vol 2013 ◽  
pp. 1-15 ◽  
Author(s):  
Carlos Couder-Castañeda ◽  
Carlos Ortiz-Alemán ◽  
Mauricio Gabriel Orozco-del-Castillo ◽  
Mauricio Nava-Flores

A CUDA implementation on a single and on several graphics processing units (GPUs) is presented for the forward modeling of gravitational fields from a three-dimensional volumetric ensemble composed of unitary prisms of constant density. We compared the performance obtained with the GPUs against a previous version coded in OpenMP with MPI, and we analyzed the results on both platforms. Today, the use of GPUs represents a breakthrough in parallel computing, which has led to the development of applications in a variety of fields. Nevertheless, in some applications the decomposition of the tasks is not trivial, as can be appreciated in this paper. Rather than a trivial decomposition of the domain, we proposed to decompose the problem by sets of prisms and to use different memory spaces per CUDA processing core, avoiding the performance decay caused by the constant calls to kernel functions that a parallelization by observation points would require. The design and implementation are the main contributions of this work, because the parallelization scheme implemented is not trivial. The performance results obtained are comparable to those of a small processing cluster.


2014 ◽  
Vol 7 (1) ◽  
pp. 267-281 ◽  
Author(s):  
B. van Werkhoven ◽  
J. Maassen ◽  
M. Kliphuis ◽  
H. A. Dijkstra ◽  
S. E. Brunnabend ◽  
...  

Abstract. The Parallel Ocean Program (POP) is used in many strongly eddying ocean circulation simulations. Ideally it would be desirable to be able to do thousand-year-long simulations, but the current performance of POP prohibits these types of simulations. In this work, using a new distributed computing approach, two methods to improve the performance of POP are presented. The first is a block-partitioning scheme for the optimization of the load balancing of POP such that it can be run efficiently in a multi-platform setting. The second is the implementation of part of the POP model code on graphics processing units (GPUs). We show that the combination of both innovations also leads to a substantial performance increase when running POP simultaneously over multiple computational platforms.


2013 ◽  
Vol 6 (3) ◽  
pp. 4705-4744 ◽  
Author(s):  
B. van Werkhoven ◽  
J. Maassen ◽  
M. Kliphuis ◽  
H. A. Dijkstra ◽  
S. E. Brunnabend ◽  
...  

Abstract. The Parallel Ocean Program (POP) is used in many strongly eddying ocean circulation simulations. Ideally one would like to do thousand-year-long simulations, but the current performance of POP prohibits this type of simulation. In this work, using a new distributed computing approach, two innovations to improve the performance of POP are presented. The first is a new block-partitioning scheme for the optimization of the load balancing of POP such that it can be run efficiently in a multi-platform setting. The second is an implementation of part of the POP model code on graphics processing units (GPUs). We show that the combination of both innovations also leads to a substantial performance increase when running POP simultaneously over multiple computational platforms.


2016 ◽  
Author(s):  
Pedro D. Bello-Maldonado ◽  
Ricardo López ◽  
Colleen Rogers ◽  
Yuanwei Jin ◽  
Enyue Lu

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 21152-21163 ◽  
Author(s):  
Rafael Cisneros-Magana ◽  
Aurelio Medina ◽  
Venkata Dinavahi ◽  
Antonio Ramos-Paz

2011 ◽  
Vol 19 (4) ◽  
pp. 199-212 ◽  
Author(s):  
Gaurav ◽  
Steven F. Wojtkiewicz

Graphics processing units (GPUs) are rapidly emerging as a more economical and highly competitive alternative to CPU-based parallel computing. As the degree of software control of GPUs has increased, many researchers have explored their use in non-gaming applications. Recent studies have shown that GPUs consistently outperform their best corresponding CPU-based parallel computing alternatives in single-instruction multiple-data (SIMD) strategies. This study explores the use of GPUs for uncertainty quantification in computational mechanics. Five types of analysis procedures that are frequently utilized for uncertainty quantification of mechanical and dynamical systems have been considered and their GPU implementations have been developed. The numerical examples presented in this study show that considerable gains in computational efficiency can be obtained for these procedures. It is expected that the GPU implementations presented in this study will serve as initial bases for further developments in the use of GPUs in the field of uncertainty quantification and will (i) aid the understanding of the performance constraints on the relevant GPU kernels and (ii) provide some guidance regarding the computational and the data structures to be utilized in these novel GPU implementations.


2018 ◽  
Vol 10 (10) ◽  
pp. 168781401880471
Author(s):  
Nenzi Wang ◽  
Hsin-Yi Chen ◽  
Yu-Wen Chen

Modern processors with many cores and large caches may offer little computational advantage if only serial computing is employed. In this study, several parallel computing approaches, using devices with multiple or many processor cores and graphics processing units, are applied and compared to illustrate their potential applications in fluid-film lubrication studies. Two Reynolds equations and an air bearing optimum design are solved using three parallel computing paradigms, OpenMP, Compute Unified Device Architecture, and OpenACC, on standalone shared-memory computers. The newly developed many-integrated-core processors are also programmed with OpenMP to exploit their computing potential. The results show that OpenACC computing can perform better than OpenMP computing for the discretized Reynolds equation with a large gridwork. This is mainly due to the larger sizes of available cache in the tested graphics processing units. The bearing design benefits most when the system with a many-integrated-core processor is used. This is because the many-integrated-core system can perform computation at the optimization-algorithm level and use the many processor cores effectively. A proper combination of parallel computing devices and programming models can complement efficient numerical methods or optimization algorithms to accelerate many tribological simulations or engineering designs.


Energies ◽  
2020 ◽  
Vol 13 (9) ◽  
pp. 2147
Author(s):  
Dong-Hee Yoon ◽  
Youngsun Han

A power flow study aims to analyze a power system by obtaining the voltage and phase angle of the buses inside the power system. Power flow computation basically uses a numerical method to solve a nonlinear system, which takes a certain amount of time because it may take many iterations to find the final solution. In addition, as the size and complexity of power systems increase, further computational power is required for power system studies. Therefore, there have been many attempts to conduct power flow computation with large amounts of data using parallel computing to reduce the computation time. Furthermore, with recent system developments, attempts have been made to increase the speed of parallel computing using graphics processing units (GPUs). In this review paper, we summarize issues related to parallel processing in power flow studies and analyze research into the performance of fast power flow computations using parallel computing methods with GPUs.

