High Performance Computation by Multi-Node GPU Cluster-Tsubame2.0 on the Air Flow in an Urban City Using Lattice Boltzmann Method

Author(s):  
Xian Wang ◽  
Takayuki Aoki
Author(s):  
Zhi Shang ◽  
Ming Cheng ◽  
Jing Lou

The lattice Boltzmann method (LBM) is an attractive, relatively new computational approach for simulating isothermal multiphase flows in computational fluid dynamics (CFD). It is based on kinetic theory and is easy to parallelize. This study analyzes the performance of a parallel LBM code for incompressible two-phase flows at high density and viscosity ratios. For this purpose, the impact of a liquid drop on a wetted wall with a pre-existing thin film of the same liquid is simulated using the parallel LBM code. The domain decomposition, data communication, and parallelization of the LBM code with the Message Passing Interface (MPI) library are investigated. The computational results show that the parallel LBM code achieves good parallel speed-up.
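The domain decomposition and parallel speed-up mentioned in this abstract can be illustrated with a minimal sketch. It assumes a one-dimensional slab decomposition, which the abstract does not specify, and it does not use MPI itself. The function names `slab_bounds` and `speedup_and_efficiency` are hypothetical and are not taken from the paper.

```python
def slab_bounds(n_cells, n_ranks):
    """Split n_cells lattice columns into contiguous [start, stop) slabs.

    Assumption: 1D slab decomposition; the paper's actual scheme is not stated.
    """
    if n_ranks < 1 or n_cells < n_ranks:
        raise ValueError("need at least one cell per rank")
    base, extra = divmod(n_cells, n_ranks)
    bounds = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional column each.
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def speedup_and_efficiency(t_serial, t_parallel, n_ranks):
    """Return (speed-up, parallel efficiency) from measured wall-clock times."""
    speedup = t_serial / t_parallel
    return speedup, speedup / n_ranks
```

In a real MPI code, each rank would own one slab and exchange its boundary columns with its neighbours after every streaming step.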


Author(s):  
Claudio Schepke ◽  
João V. F. Lima ◽  
Matheus S. Serpa

Currently, NVIDIA GPUs and Intel Xeon Phi accelerators are alternative computational architectures for high-performance computing. This chapter investigates the performance impact of these architectures on the lattice Boltzmann method. This method simulates fluid flows iteratively using discrete representations. It can be adopted for a large number of flow simulations using simple operation rules. The experiments consider a three-dimensional version of the method with 19 discrete directions of propagation (D3Q19). The performance evaluation compares three modern GPUs (K20M, K80, and Titan X) and two Xeon Phi architectures, Knights Corner (KNC) and Knights Landing (KNL). Titan X provides the fastest execution time of all hardware considered. The results show that GPUs offer better processing time for the application. A KNL cache implementation presents the best results for the Xeon Phi architectures, and the new Xeon Phi (KNL) is two times faster than the previous model (KNC).
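As an illustration of what "19 discrete directions" means, the following sketch enumerates the standard D3Q19 velocity set together with its usual textbook weights (1/3, 1/18, 1/36). The function name `d3q19_lattice` is hypothetical, and the ordering of directions is arbitrary; the chapter's own code may order them differently.

```python
from fractions import Fraction
from itertools import product


def d3q19_lattice():
    """Return (velocities, weights) for the standard D3Q19 lattice."""
    # Keep the rest vector, the 6 axis neighbours and the 12 face diagonals;
    # the 8 cube corners (squared length 3) are excluded.
    velocities = [
        c for c in product((-1, 0, 1), repeat=3)
        if sum(x * x for x in c) <= 2
    ]
    weight_by_sq_norm = {
        0: Fraction(1, 3),   # rest particle
        1: Fraction(1, 18),  # axis-aligned directions
        2: Fraction(1, 36),  # face-diagonal directions
    }
    weights = [weight_by_sq_norm[sum(x * x for x in c)] for c in velocities]
    return velocities, weights
```

Exact fractions are used here only so that the normalization of the weights can be checked without rounding error; a performance-oriented code would store them as floating-point constants.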


Author(s):  
Timothy J. Spencer ◽  
Ian Halliday ◽  
Chris M. Care

The lattice Boltzmann method (LBM) for computational fluid dynamics benefits from a simple, explicit, completely local computational algorithm, which makes it highly efficient. We extend LBM to recover the hydrodynamics of multi-component immiscible fluids while retaining a completely local, explicit and simple algorithm. Hence, no computationally expensive lattice gradients, interaction potentials or curvatures that use information from neighbouring lattice sites need to be calculated, which makes the method highly scalable and suitable for high-performance parallel computing. The method is analytical and is shown to recover the correct continuum hydrodynamic equations of motion and interfacial boundary conditions. This LBM may be further extended to situations containing a high number (O(100)) of individually immiscible drops. We compare the emergent non-Newtonian behaviour with a power-law fluid model. We anticipate that our method will have a range of applications in engineering, industrial and biological sciences.


2018 ◽  
Vol 168 ◽  
pp. 14-20 ◽  
Author(s):  
You-Hsun Lee ◽  
Li-Min Huang ◽  
You-Seng Zou ◽  
Shao-Ching Huang ◽  
Chao-An Lin

Computation ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 44
Author(s):  
Ivan Girotto ◽  
Sebastiano Fabio Schifano ◽  
Enrico Calore ◽  
Gianluca Di Staso ◽  
Federico Toschi

This paper analyses both the computing performance and the energy efficiency of a Lattice Boltzmann Method (LBM) based application used to simulate three-dimensional multicomponent turbulent systems on massively parallel architectures for high-performance computing. Extending results reported in previous works, the analysis is meant to demonstrate the impact of using optimized data layouts designed for LBM based applications on high-end computer platforms. Particular focus is given to the Intel Skylake processor and to comparing this target architecture with other models of the Intel processor family. We introduce the main motivations of the presented work as well as the relevance of its scientific application. We analyse the measured performance of the implemented data layouts on the Skylake processor while scaling the number of threads per socket. We compare the results obtained on several CPU generations of the Intel processor family, analyse the energy efficiency of the Skylake processor compared with the Intel Xeon Phi processor, and finally give our interpretation of the presented results.


Mathematics ◽  
2021 ◽  
Vol 9 (15) ◽  
pp. 1793
Author(s):  
Michal Takáč ◽  
Ivo Petráš

This paper deals with the design and implementation of a cross-platform lattice Boltzmann method solver (D2Q9-BGK and D3Q27-MRT) for 2D and 3D flows, developed with the ArrayFire library for high-performance computing. The solver leverages ArrayFire’s just-in-time compilation engine to compile high-level code into optimized kernels for both CUDA and OpenCL GPU backends. We also provide C++ and Rust implementations and show that it is possible to produce fast cross-platform lattice Boltzmann method simulations with minimal code, effectively fewer than 90 lines. Illustrative benchmarks (lid-driven cavity and Kármán vortex street) for single and double precision floating-point simulations on 4 different GPUs are provided.
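To make the "D2Q9-BGK" label concrete, here is a minimal sketch of the BGK collision step at a single lattice node, written in plain Python rather than with ArrayFire. It uses the standard D2Q9 velocities, weights and second-order equilibrium; the function names and the direction ordering are assumptions for illustration and are not taken from the paper's solver.

```python
# Standard D2Q9 velocity set: rest, 4 axis directions, 4 diagonals.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def moments(f):
    """Density and velocity from the nine populations at one node."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, C)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, C)) / rho
    return rho, ux, uy


def equilibrium(rho, ux, uy):
    """Second-order equilibrium distribution (lattice units, c_s^2 = 1/3)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def bgk_collide(f, tau):
    """Relax each population toward equilibrium with relaxation time tau."""
    rho, ux, uy = moments(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]
```

A full solver would apply this collision at every node and then stream each population to the neighbouring node along its velocity; an array library lets both steps be written as whole-array operations.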

