Lattice Boltzmann Simulations of Cavity Flows on Graphic Processing Unit with Memory Management

2017 ◽  
Vol 33 (6) ◽  
pp. 863-871 ◽  
Author(s):  
P. Y. Hong ◽  
L. M. Huang ◽  
C. Y. Chang ◽  
C. A. Lin

The lattice Boltzmann method (LBM) is adopted to compute two- and three-dimensional lid-driven cavity flows in order to examine the influence of memory management on computational performance on a graphics processing unit (GPU). Both single-relaxation-time (SRT) and multi-relaxation-time (MRT) LBM are adopted. The computations are conducted on NVIDIA GeForce Titan, Tesla C2050 and GeForce GTX 560Ti devices. Performance using global memory deteriorates greatly when MRT LBM is used, because the scheme requests more information from global memory than its SRT counterpart. With on-chip memory, on the other hand, the difference between MRT and SRT is not significant. In addition, the LBM streaming procedure performs 50% to 100% better with offset reading than with offset writing, and this applies to both SRT and MRT LBM. Finally, comparisons across GPU platforms indicate that the Titan, as expected, outperforms the other devices: it attains speedups of 227 and 193 over its Intel Core i7-990 CPU counterpart, and is four times faster than the GTX 560Ti and Tesla C2050, for three-dimensional cavity flow simulations with single and double precision, respectively.
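The offset-reading versus offset-writing distinction in the streaming step can be illustrated with a minimal sketch. It assumes that "offset writing" corresponds to what is often called push streaming and "offset reading" to pull streaming. The 1D three-velocity lattice, the periodic wrap-around, and the function names are illustrative assumptions, not the authors' implementation. The sketch only shows that the two access patterns produce the same result; it says nothing about GPU performance.

```python
# Velocities of a hypothetical 1D three-velocity lattice: rest, right, left.
VELOCITIES = (0, 1, -1)


def stream_push(f):
    """Offset writing: read the local node, write to the shifted neighbour."""
    n = len(f[0])
    out = [[0.0] * n for _ in VELOCITIES]
    for q, c in enumerate(VELOCITIES):
        for x in range(n):
            out[q][(x + c) % n] = f[q][x]  # the store address is offset
    return out


def stream_pull(f):
    """Offset reading: read from the shifted neighbour, write the local node."""
    n = len(f[0])
    out = [[0.0] * n for _ in VELOCITIES]
    for q, c in enumerate(VELOCITIES):
        for x in range(n):
            out[q][x] = f[q][(x - c) % n]  # the load address is offset
    return out


# Distribution values chosen so that each entry encodes (direction, node).
f = [[float(10 * q + x) for x in range(5)] for q in range(3)]

# Both variants move the same data; only the offset side differs.
assert stream_push(f) == stream_pull(f)
print(stream_pull(f)[1])  # [14.0, 10.0, 11.0, 12.0, 13.0]
```

In the pull variant every store goes to the node's own index, so only the loads are shifted, which is the access pattern the abstract reports as faster on the tested GPUs.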

2021 ◽  
Vol 87 (5) ◽  
pp. 363-373
Author(s):  
Long Chen ◽  
Bo Wu ◽  
Yao Zhao ◽  
Yuan Li

Real-time acquisition and analysis of three-dimensional (3D) human body kinematics are essential in many applications. In this paper, we present a real-time photogrammetric system consisting of a stereo pair of red-green-blue (RGB) cameras. The system incorporates a multi-threaded and graphics processing unit (GPU)-accelerated solution for real-time extraction of 3D human kinematics. A deep learning approach is adopted to automatically extract two-dimensional (2D) human body features, which are then converted to 3D features based on photogrammetric processing, including dense image matching and triangulation. The multi-threading scheme and GPU-acceleration enable real-time acquisition and monitoring of 3D human body kinematics. Experimental analysis verified that the system processing rate reached ∼18 frames per second. The effective detection distance reached 15 m, with a geometric accuracy of better than 1% of the distance within a range of 12 m. The real-time measurement accuracy for human body kinematics ranged from 0.8% to 7.5%. The results suggest that the proposed system is capable of real-time acquisition and monitoring of 3D human kinematics with favorable performance, showing great potential for various applications.
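The 2D-to-3D conversion step can be sketched for the simplest possible case. The sketch assumes an ideal rectified stereo pair with a pinhole camera model, so that depth follows from horizontal disparity alone. The function name and all parameter values are hypothetical, and the paper's actual photogrammetric pipeline (dense matching, calibration, distortion handling) is not reproduced here.

```python
def triangulate_rectified(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Return (x, y, z) in metres for one matched feature in a rectified pair.

    Assumes both cameras share the focal length focal_px (pixels) and the
    principal point (cx, cy), and are separated by baseline_m along x.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (u_left - cx) * z / focal_px        # back-project the left pixel
    y = (v_left - cy) * z / focal_px
    return x, y, z


# Hypothetical joint detected at (700, 400) in the left image and (660, 400) in the right.
point = triangulate_rectified(
    u_left=700.0, v_left=400.0, u_right=660.0,
    focal_px=1000.0, baseline_m=0.5, cx=640.0, cy=360.0,
)
print(point)  # (0.75, 0.5, 12.5)
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels produces a depth error that grows with distance, which is consistent with accuracy being quoted as a percentage of the range.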


Author(s):  
Hui Huang ◽  
Jian Chen ◽  
Blair Carlson ◽  
Hui-Ping Wang ◽  
Paul Crooker ◽  
...  

Due to the enormous computational cost, current residual stress simulations of multipass girth welds are mostly performed using two-dimensional (2D) axisymmetric models. A 2D model can provide only a limited estimate of the residual stresses, because it assumes that their distribution is axisymmetric. In this study, a highly efficient thermal-mechanical finite element code for three-dimensional (3D) models has been developed for high-performance Graphics Processing Unit (GPU) computers. The code is further accelerated by exploiting the unique physics of welding processes, which are characterized by steep temperature gradients and a moving arc heat source. It is capable of modeling large-scale welding problems that cannot be easily handled by existing commercial simulation tools. To demonstrate its accuracy and efficiency, the code was compared with a commercial software package by simulating a 3D multi-pass girth weld model with over 1 million elements. The code achieved solution accuracy comparable to that of the commercial package, with a more than 100-fold saving in computational cost. Moreover, the three-dimensional analysis revealed a more realistic stress distribution that is not axisymmetric in the hoop direction.


2011 ◽  
Vol 110-116 ◽  
pp. 2740-2745
Author(s):  
Kirana Kumara P. ◽  
Ashitava Ghosal

Real-time simulation of deformable solids is essential for applications such as biological organ simulation in surgical simulators. In this work, deformable solids are approximated as linear elastic, and a simple and straightforward numerical technique, the Finite Point Method (FPM), is used to model three-dimensional linear elastostatics. A Graphics Processing Unit (GPU) is used to accelerate the computations. Results show that the Finite Point Method, together with a GPU, can compute three-dimensional linear elastostatic responses of solids at rates suitable for real-time graphics, for solids represented by a reasonable number of points.


Author(s):  
Minglei Shan ◽  
Yu Yang ◽  
Hao Peng ◽  
Qingbang Han ◽  
Changping Zhu

Understanding the dynamic characteristics of a cavitation bubble near a solid wall is a fundamental issue for both the application and the prevention of bubble collapse. In the present work, an improved three-dimensional multi-relaxation-time pseudopotential lattice Boltzmann model is adopted to investigate cavitation bubble collapse near a solid wall. The model is first examined with respect to thermodynamic consistency and verification of the Laplace law. Theoretical analysis shows that the model can be regarded as a solver of the Rayleigh–Plesset equation, and this is confirmed by comparing the lattice Boltzmann simulation with the Rayleigh–Plesset calculation for cavitation bubble collapse in an infinite medium. The bubble collapse near the solid wall is then modeled using the improved model. Comparing the evolution of the bubble profiles shows that the lattice Boltzmann simulation and the experimental results follow the same dynamic process. From the evolution of the pressure and velocity fields, it is found that the tapered higher-pressure region formed near the top of the bubble is a crucial driving force inducing the bubble collapse. This exploratory research demonstrates that the lattice Boltzmann method is an alternative tool for studying the interaction between a collapsing cavitation bubble and matter.
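For reference, the Rayleigh–Plesset equation mentioned above is commonly written in the following form for a spherical bubble in an incompressible liquid. The symbols are assumptions, since the abstract does not define them: $R(t)$ is the bubble radius, $\rho$ the liquid density, $p_B$ the pressure inside the bubble, $p_\infty$ the far-field pressure, $\sigma$ the surface tension, and $\mu$ the dynamic viscosity of the liquid. The abstract does not say which variant the authors use.

```latex
R\,\ddot{R} + \frac{3}{2}\,\dot{R}^{2}
  = \frac{1}{\rho}\left( p_B(t) - p_\infty(t) - \frac{2\sigma}{R} - \frac{4\mu\,\dot{R}}{R} \right)
```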


2020 ◽  
Vol 10 (7) ◽  
pp. 2359
Author(s):  
Sajad Mohammadi ◽  
Hamidreza Karami ◽  
Mohammad Azadifar ◽  
Farhad Rachidi

An open accelerator (OpenACC)-aided graphics processing unit (GPU)-based finite difference time domain (FDTD) method is presented for the first time for the 3D evaluation of lightning radiated electromagnetic fields along a complex terrain with arbitrary topography. The OpenACC directive-based programming model is used to enhance the computational performance, and the results are compared with those obtained by using a CPU-based model. It is shown that OpenACC GPUs can provide very accurate results, and they are more than 20 times faster than CPUs. The presented results support the use of OpenACC not only in relation to lightning electromagnetics problems, but also to large-scale realistic electromagnetic compatibility (EMC) applications in which computation time efficiency is a critical factor.
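The kind of loop nest that a directive-based model targets can be seen in a heavily simplified sketch of an FDTD update. The sketch below is a 1D vacuum example in normalized units, written in plain Python for readability. It is not the authors' 3D terrain model, and it contains no OpenACC directives, which apply to C, C++ and Fortran code. The function name, grid size, source shape and Courant number are all illustrative assumptions.

```python
import math


def fdtd_1d(n_cells=200, n_steps=100, source_cell=100, courant=0.5):
    """Leapfrog update of Ez and Hy on a staggered 1D grid (normalized units)."""
    ez = [0.0] * n_cells
    hy = [0.0] * n_cells
    for step in range(n_steps):
        # Magnetic field update from the spatial difference of Ez.
        # Each iteration is independent of the others in this loop.
        for k in range(n_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Electric field update from the spatial difference of Hy.
        # ez[0] is never updated, so the left end acts as a reflecting wall.
        for k in range(1, n_cells):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Additive ("soft") Gaussian source centred on time step 30.
        ez[source_cell] += math.exp(-0.5 * ((step - 30) / 8.0) ** 2)
    return ez, hy


ez, hy = fdtd_1d()
peak_cell = max(range(100, 200), key=lambda k: ez[k])
print("right-going pulse peak near cell", peak_cell)
```

The two inner loops have no dependencies between iterations, which is the property that makes this class of stencil update a natural candidate for an OpenACC `parallel loop` directive in a compiled implementation.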


2019 ◽  
Vol 2019 ◽  
pp. 1-15 ◽  
Author(s):  
Jianqi Lai ◽  
Hua Li ◽  
Zhengyu Tian ◽  
Ye Zhang

Computational fluid dynamics (CFD) plays an important role in the optimal design of aircraft and the analysis of complex flow mechanisms in the aerospace domain. The graphics processing unit (GPU) has a strong floating-point operation capability and a high memory bandwidth in data parallelism, which brings great opportunities for CFD. A cell-centred finite volume method is applied to solve the three-dimensional compressible Navier–Stokes equations on structured meshes, with an upwind AUSM+UP numerical scheme for space discretization and a four-stage Runge–Kutta method for time discretization. Compute unified device architecture (CUDA) is used as a parallel computing platform and programming model for GPUs, which reduces the complexity of programming. The main purpose of this paper is to design an extremely efficient multi-GPU parallel algorithm based on MPI+CUDA to study hypersonic flow characteristics. Solutions of hypersonic flow over an aerospace plane model are provided at different Mach numbers. The agreement between numerical computations and experimental measurements is favourable. Acceleration performance of the parallel platform is studied with a single GPU, two GPUs, and four GPUs. For the single-GPU implementation, the speedup reaches 63 for the coarser mesh and 78 for the finest mesh. GPUs are better suited for compute-intensive tasks than traditional CPUs. For multi-GPU parallelization, the speedup of four GPUs reaches 77 for the coarser mesh and 147 for the finest mesh; this is far greater than the acceleration achieved by a single GPU and two GPUs. The multi-GPU parallel algorithm is therefore a promising approach for hypersonic flow computations.
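The reported speedups can be turned into a rough multi-GPU scaling estimate. The short calculation below assumes that the single-GPU and four-GPU speedups for a given mesh are measured against the same CPU baseline, which the abstract does not state explicitly. The function name is illustrative.

```python
def scaling(single_gpu_speedup, multi_gpu_speedup, n_gpus):
    """Return (gain over one GPU, parallel efficiency) from reported speedups."""
    gain = multi_gpu_speedup / single_gpu_speedup
    efficiency = gain / n_gpus
    return gain, efficiency


# Speedups quoted in the abstract: (mesh, single GPU, four GPUs).
for mesh, s1, s4 in (("coarser", 63, 77), ("finest", 78, 147)):
    gain, eff = scaling(s1, s4, 4)
    print(f"{mesh}: {gain:.2f}x over one GPU, {eff:.0%} parallel efficiency")
# coarser: 1.22x over one GPU, 31% parallel efficiency
# finest: 1.88x over one GPU, 47% parallel efficiency
```

Under that assumption, the finer mesh scales noticeably better across four GPUs than the coarser one, in line with the higher speedups the abstract reports for the finest mesh.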

