High Performance Computation by Multi-Node GPU Cluster-Tsubame2.0 on the Air Flow in an Urban City Using Lattice Boltzmann Method

The lattice Boltzmann method (LBM) is a new and attractive computational approach for simulating isothermal multi-phase flows in computational fluid dynamics (CFD). It is based on kinetic theory and is easy to parallelize. This study analyzes the performance of a parallel LBM code for incompressible two-phase flows at high density and viscosity ratios. For this purpose, the impact of a liquid drop on a wetted wall with a pre-existing thin film of the same liquid is simulated with the parallel LBM code. The domain decomposition, data communication and parallelization of the LBM code using the Message Passing Interface (MPI) library are investigated. The computational results show that the parallel LBM code achieves good parallel speed-up.
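The abstract does not say how the domain is decomposed, so the following is only a minimal sketch of one common choice: a one-dimensional slab decomposition in which each MPI rank owns a contiguous block of lattice rows and exchanges boundary rows with its two neighbours. The function names `slab_decompose` and `halo_neighbours` are hypothetical, the periodic wrap-around is an assumption, and no MPI calls are made.

```python
def slab_decompose(n_rows, n_ranks):
    """Split n_rows lattice rows into n_ranks contiguous slabs.

    Returns a list of (start, stop) half-open row ranges, one per rank.
    The first (n_rows % n_ranks) ranks receive one extra row.
    """
    base, extra = divmod(n_rows, n_ranks)
    slabs = []
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        slabs.append((start, start + size))
        start += size
    return slabs


def halo_neighbours(rank, n_ranks):
    """Ranks owning the slabs below and above this one (periodic, assumed)."""
    return ((rank - 1) % n_ranks, (rank + 1) % n_ranks)


if __name__ == "__main__":
    for rank, (lo, hi) in enumerate(slab_decompose(10, 4)):
        print(rank, (lo, hi), halo_neighbours(rank, 4))
```

In an actual MPI code, each rank would send its first and last rows to the ranks returned by `halo_neighbours` before every streaming step.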

Challenges on Porting Lattice Boltzmann Method on Accelerators

Advances in Computer and Electrical Engineering - Analysis and Applications of Lattice Boltzmann Simulations ◽

10.4018/978-1-5225-4760-0.ch002 ◽

2018 ◽

pp. 30-53 ◽

Cited By ~ 1

Author(s):

Claudio Schepke ◽

João V. F. Lima ◽

Matheus S. Serpa

Keyword(s):

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

High Performance ◽

Three Dimensional ◽

Fluid Flows ◽

Xeon Phi ◽

Performance Impact ◽

Operation Rules ◽

Dimensional Version ◽

Boltzmann Method

NVIDIA GPUs and Intel Xeon Phi accelerators are currently two alternative architectures for high-performance computing. This chapter investigates the performance impact of these architectures on the lattice Boltzmann method, which simulates fluid flows iteratively using discrete representations and can be applied to a large number of flow simulations using simple operation rules. The experiments use a three-dimensional version of the method with 19 discrete directions of propagation (D3Q19). The performance evaluation compares three modern GPUs (K20M, K80, and Titan X) and two Xeon Phi architectures (Knights Corner, KNC, and Knights Landing, KNL). Titan X provides the fastest execution time of all hardware considered, and the results show that GPUs offer better processing time for the application. A KNL cache implementation gives the best results among the Xeon Phi architectures, and the new Xeon Phi (KNL) is two times faster than the previous model (KNC).
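For readers unfamiliar with the D3Q19 notation, the sketch below builds the standard 19-velocity set (one rest vector, 6 axis vectors, 12 face diagonals) with the usual weights 1/3, 1/18 and 1/36. It is an illustration of the lattice only, not the chapter's code. The function name `d3q19` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def d3q19():
    """Return the D3Q19 velocity vectors and their lattice weights."""
    velocities, weights = [], []
    for c in product((-1, 0, 1), repeat=3):
        nonzero = sum(abs(x) for x in c)
        if nonzero == 0:
            w = Fraction(1, 3)    # rest velocity
        elif nonzero == 1:
            w = Fraction(1, 18)   # 6 axis-aligned velocities
        elif nonzero == 2:
            w = Fraction(1, 36)   # 12 face-diagonal velocities
        else:
            continue              # the 8 corner vectors are not part of D3Q19
        velocities.append(c)
        weights.append(w)
    return velocities, weights


if __name__ == "__main__":
    vel, wts = d3q19()
    print(len(vel), sum(wts))
```

Exact fractions are used so that the normalisation of the weights can be checked without rounding error.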

A local lattice Boltzmann method for multiple immiscible fluids and dense suspensions of drops

Philosophical Transactions of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rsta.2011.0029 ◽

2011 ◽

Vol 369 (1944) ◽

pp. 2255-2263 ◽

Cited By ~ 12

Author(s):

Timothy J. Spencer ◽

Ian Halliday ◽

Chris M. Care

Keyword(s):

Biological Sciences ◽

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

High Performance ◽

Fluid Model ◽

Equations Of Motion ◽

Simple Algorithm ◽

Immiscible Fluids ◽

Local Lattice ◽

Boltzmann Method

The lattice Boltzmann method (LBM) for computational fluid dynamics benefits from a simple, explicit, completely local computational algorithm, which makes it highly efficient. We extend LBM to recover the hydrodynamics of multi-component immiscible fluids while retaining a completely local, explicit and simple algorithm. Hence, no computationally expensive lattice gradients, interaction potentials or curvatures, which use information from neighbouring lattice sites, need to be calculated. This makes the method highly scalable and suitable for high-performance parallel computing. The method is analytical and is shown to recover the correct continuum hydrodynamic equations of motion and interfacial boundary conditions. This LBM may be further extended to situations containing a high number (O(100)) of individually immiscible drops. We make comparisons of the emergent non-Newtonian behaviour with a power-law fluid model. We anticipate our method will have a range of applications in engineering, industrial and biological sciences.

Simulations of turbulent duct flow with lattice Boltzmann method on GPU cluster

Computers & Fluids ◽

10.1016/j.compfluid.2018.03.064 ◽

2018 ◽

Vol 168 ◽

pp. 14-20 ◽

Cited By ~ 6

Author(s):

You-Hsun Lee ◽

Li-Min Huang ◽

You-Seng Zou ◽

Shao-Ching Huang ◽

Chao-An Lin

Keyword(s):

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

Duct Flow ◽

Gpu Cluster ◽

Turbulent Duct Flow ◽

Boltzmann Method

Performance and Energy Assessment of a Lattice Boltzmann Method Based Application on the Skylake Processor

Computation ◽

10.3390/computation8020044 ◽

2020 ◽

Vol 8 (2) ◽

pp. 44

Author(s):

Ivan Girotto ◽

Sebastiano Fabio Schifano ◽

Enrico Calore ◽

Gianluca Di Staso ◽

Federico Toschi

Keyword(s):

Energy Efficiency ◽

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

High Performance ◽

Three Dimensional ◽

Scientific Application ◽

Energy Assessment ◽

Boltzmann Method ◽

The Impact ◽

Data Layouts

This paper analyses both the computing performance and the energy efficiency of a Lattice Boltzmann Method (LBM) based application used to simulate three-dimensional multicomponent turbulent systems on massively parallel architectures for high-performance computing. Extending results reported in previous works, the analysis is meant to demonstrate the impact of using optimized data layouts designed for LBM based applications on high-end computer platforms. Particular focus is given to the Intel Skylake processor and to comparing this target architecture with other models of the Intel processor family. We introduce the main motivations of the presented work as well as the relevance of its scientific application. We analyse the measured performance of the implemented data layouts on the Skylake processor while scaling the number of threads per socket. We compare the results obtained on several CPU generations of the Intel processor family, analyse the energy efficiency of the Skylake processor compared with the Intel Xeon Phi processor, and finally add our interpretation of the presented results.
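The abstract does not name the data layouts it evaluates. As a generic illustration of why layout matters, the sketch below contrasts the two textbook options for storing `n_q` populations at each of `n_sites` lattice sites in one flat array: array of structures (AoS) and structure of arrays (SoA). The function names are hypothetical, and these two layouts are an assumption for illustration, not necessarily those studied in the paper.

```python
def aos_index(site, q, n_sites, n_q):
    """Array of structures: the n_q populations of one site are contiguous."""
    return site * n_q + q


def soa_index(site, q, n_sites, n_q):
    """Structure of arrays: population q of all sites is contiguous."""
    return q * n_sites + site


def site_stride(index_fn, n_sites, n_q):
    """Distance in memory between the same population on adjacent sites."""
    return index_fn(1, 0, n_sites, n_q) - index_fn(0, 0, n_sites, n_q)


if __name__ == "__main__":
    print(site_stride(aos_index, 1000, 19))  # stride of n_q elements
    print(site_stride(soa_index, 1000, 19))  # unit stride
```

With SoA, a vector unit that processes several adjacent sites at once reads consecutive memory locations for a given population, whereas AoS keeps all populations of a single site together.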

High‐performance SIMD implementation of the lattice‐Boltzmann method on the Xeon Phi processor

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.5072 ◽

2018 ◽

Vol 31 (13) ◽

pp. e5072 ◽

Cited By ~ 1

Author(s):

Fredrik Robertsén ◽

Keijo Mattila ◽

Jan Westerholm

Keyword(s):

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

High Performance ◽

Xeon Phi ◽

Boltzmann Method

Provide a suitable range to include the thermal creeping effect on slip velocity and temperature jump of an air flow in a nanochannel by lattice Boltzmann method

Physica E Low-dimensional Systems and Nanostructures ◽

10.1016/j.physe.2016.08.021 ◽

2017 ◽

Vol 85 ◽

pp. 143-151 ◽

Cited By ~ 15

Author(s):

Arash Karimipour

Keyword(s):

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

Slip Velocity ◽

Temperature Jump ◽

Air Flow ◽

Boltzmann Method

Cross-Platform GPU-Based Implementation of Lattice Boltzmann Method Solver Using ArrayFire Library

Mathematics ◽

10.3390/math9151793 ◽

2021 ◽

Vol 9 (15) ◽

pp. 1793

Author(s):

Michal Takáč ◽

Ivo Petráš

Keyword(s):

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

High Performance ◽

Just In Time ◽

Double Precision ◽

Driven Cavity ◽

Cross Platform ◽

High Level ◽

Boltzmann Method ◽

3D Flows

This paper deals with the design and implementation of a cross-platform lattice Boltzmann method solver (D2Q9-BGK and D3Q27-MRT) for 2D and 3D flows, developed with the ArrayFire library for high-performance computing. The solver leverages ArrayFire’s just-in-time compilation engine to compile high-level code into optimized kernels for both CUDA and OpenCL GPU backends. We also provide C++ and Rust implementations and show that it is possible to produce fast cross-platform lattice Boltzmann method simulations with minimal code, effectively fewer than 90 lines. Illustrative benchmarks (lid-driven cavity and Kármán vortex street) for single and double precision floating-point simulations on 4 different GPUs are provided.
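To show how compact a D2Q9-BGK update can be, here is a minimal sketch in plain Python on a small periodic lattice. It does not use ArrayFire and is not the authors' solver. The function names, the nested-list storage `f[y][x][i]` and the periodic boundaries are all assumptions made for illustration.

```python
# D2Q9 velocity set: rest, 4 axis directions, 4 diagonals, with standard weights.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def step(f, nx, ny, tau):
    """One BGK collide-and-stream step on a periodic nx-by-ny lattice."""
    new = [[[0.0] * 9 for _ in range(nx)] for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            cell = f[y][x]
            rho = sum(cell)
            ux = sum(fi * c[0] for fi, c in zip(cell, C)) / rho
            uy = sum(fi * c[1] for fi, c in zip(cell, C)) / rho
            feq = equilibrium(rho, ux, uy)
            for i, (cx, cy) in enumerate(C):
                # Relax towards equilibrium, then move to the neighbouring site.
                post = cell[i] - (cell[i] - feq[i]) / tau
                new[(y + cy) % ny][(x + cx) % nx][i] = post
    return new


def total_mass(f):
    """Sum of all populations over the whole lattice."""
    return sum(sum(sum(cell) for cell in row) for row in f)
```

Because collision conserves the local density and periodic streaming only moves populations between sites, `total_mass` should stay constant from step to step up to rounding error, which gives a quick sanity check.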
