Parallel Computation of a Mixed Convection Problem Using Fully-Coupled and Segregated Algorithms

Author(s):  
Masoud Darbandi ◽  
Araz Banaeizadeh ◽  
Gerry E. Schneider

In this work, a parallel solution of the Navier-Stokes equations for a mixed convection heat transfer problem is achieved using a finite-element-based finite-volume method with fully coupled and semi-coupled algorithms. A major drawback of implicit methods is the need to solve a very large set of linear algebraic equations in large-scale problems. The current parallel computation is developed for distributed-memory machines; matrix decomposition and solution are carried out with the PETSc library. In the fully coupled algorithm, the two-dimensional governing equations produce a 36-diagonal global matrix. To reduce the computational time, this matrix is suitably broken into several sub-matrices, which are then solved in a segregated manner, yielding four 9-diagonal matrices. Different sparse-solver algorithms are utilized to solve a mixed natural-forced convection problem with either the fully coupled or the semi-coupled algorithm. The performance of the solvers is then investigated in a distributed computing environment. The study shows that the per-iteration run time decreases considerably, although the overall run time of the fully coupled algorithm still looks better.
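The coupled-versus-segregated trade-off described above can be illustrated with a toy 2×2 block system standing in for the 36-diagonal matrix and its four sub-matrices. This is a minimal dense NumPy sketch under assumed block names and sizes, not the paper's PETSc implementation.

```python
import numpy as np

def coupled_solve(A, B, C, D, f, g):
    """Assemble and solve the full block system [[A, B], [C, D]] in one shot."""
    n = len(f)
    K = np.block([[A, B], [C, D]])
    sol = np.linalg.solve(K, np.concatenate([f, g]))
    return sol[:n], sol[n:]

def segregated_solve(A, B, C, D, f, g, sweeps=60):
    """Block Gauss-Seidel: repeatedly solve each smaller diagonal block,
    lagging the coupling terms (the segregated approach)."""
    u = np.zeros(len(f))
    p = np.zeros(len(g))
    for _ in range(sweeps):
        u = np.linalg.solve(A, f - B @ p)   # first-field block
        p = np.linalg.solve(D, g - C @ u)   # second-field block
    return u, p
```

When the off-diagonal coupling blocks are weak, the segregated sweeps converge to the coupled solution while only ever factoring the smaller diagonal blocks.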

2013 ◽  
Vol 21 (04) ◽  
pp. 1350017
Author(s):  
RAMIN KAVIANI ◽  
VAHID ESFAHANIAN ◽  
MOHAMMAD EBRAHIMI

The grid resolutions affordable in conventional large-eddy simulations (LES) of high-Reynolds-number jet flows cannot capture the sound generated by fluid motions near and beyond the grid cut-off scale. As a result, the frequency spectrum of the extrapolated sound field is artificially truncated at high frequencies. In this paper, a new method is proposed to account for high-frequency noise sources beyond the resolution of a compressible flow simulation. The large-scale turbulent structures, the dominant radiators of sound, are captured in LES by satisfying the filtered Navier–Stokes equations, while for small-scale turbulence a Kolmogorov spectrum is imposed. The latter is performed via a wavelet-based extrapolation that adds randomly generated small-scale noise sources to the LES near-field data. Further, the vorticity and instability waves are filtered out via a passive wavelet-based masking, and the whole spectrum of filtered data is captured on a Ffowcs Williams–Hawkings (FW-H) surface surrounding the near-field region and projected to the acoustic far field. The algorithm can be implemented as a separate post-processing stage, and the computational time is considerably reduced by utilizing a hybrid many-core/multi-core framework, i.e. MPI-CUDA programming. Comparison of the results obtained from this procedure with experiments for high subsonic and transonic jets shows that the far-field noise spectra agree well up to twice the grid cut-off frequency.
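The idea of imposing a Kolmogorov spectrum on the unresolved scales can be sketched in one dimension: a signal is shaped in Fourier space so its energy spectrum decays as k^(-5/3) beyond a cut-off wavenumber. This is an illustrative NumPy sketch using plain spectral filtering rather than the paper's wavelet-based extrapolation; `k_cut` is a stand-in for the grid cut-off.

```python
import numpy as np

def impose_kolmogorov(signal, k_cut):
    """Shape a 1-D signal so its energy spectrum decays as k^(-5/3)
    for wavenumbers above k_cut.

    Since E(k) ~ |F(k)|^2, the Fourier amplitudes are scaled by k^(-5/6).
    """
    n = len(signal)
    F = np.fft.rfft(signal)
    k = np.fft.rfftfreq(n, d=1.0) * n            # integer wavenumbers 0..n/2
    amp = np.ones_like(k)
    high = k >= k_cut
    amp[high] = (k[high] / k_cut) ** (-5.0 / 6.0)  # enforce the -5/3 energy slope
    return np.fft.irfft(F * amp, n)
```

Applied to white noise, the output retains the resolved band untouched and carries the prescribed inertial-range decay above `k_cut`.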


Author(s):  
Jianhui Xie ◽  
R. S. Amano

In fluid flow and heat transfer, the finite-element-based fully coupled solution of all conservation equations is cost-effective for most two-dimensional, isothermal problems, but suffers in storage and solution efficiency for large three-dimensional problems. The segregated solution algorithm has been designed to address large-scale simulation while avoiding the direct formulation of a global matrix. There is a trade-off between the large number of inexpensive iterations performed by segregated solvers and the smaller number of more expensive iterations of fully coupled solvers. In this paper, a finite-element scheme based on a preconditioned GMRES coupled algorithm and on SUPG (Streamline Upwind Petrov-Galerkin) pressure prediction/correction segregated formulations is discussed for solving the steady Navier-Stokes equations. A systematic comparison and benchmark between the segregated and fully coupled formulations is presented, evaluating the individual benefits and strengths of each procedure on the lid-driven cavity problem and a large industrial application with respect to system storage and solution convergence.
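A minimal serial sketch of the preconditioned GMRES building block mentioned above, using the Arnoldi process with left preconditioning. It is a NumPy reference version, not the paper's finite-element solver; `M_inv` is assumed to be an explicit approximate inverse, e.g. a Jacobi (diagonal) preconditioner.

```python
import numpy as np

def gmres(A, b, M_inv, tol=1e-10, max_iter=50):
    """Left-preconditioned GMRES via the Arnoldi process (serial sketch)."""
    n = len(b)
    x = np.zeros(n)
    r = M_inv @ b                           # preconditioned residual at x0 = 0
    beta = np.linalg.norm(r)
    Q = np.zeros((n, max_iter + 1))         # orthonormal Krylov basis
    H = np.zeros((max_iter + 1, max_iter))  # Hessenberg matrix
    Q[:, 0] = r / beta
    for k in range(max_iter):
        w = M_inv @ (A @ Q[:, k])           # apply preconditioned operator
        for j in range(k + 1):              # modified Gram-Schmidt step
            H[j, k] = Q[:, j] @ w
            w = w - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] > 1e-14:
            Q[:, k + 1] = w / H[k + 1, k]
        e1 = np.zeros(k + 2)
        e1[0] = beta
        # least-squares solve of the small Hessenberg system
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], e1, rcond=None)
        x = Q[:, :k + 1] @ y
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            break
    return x
```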


2020 ◽  
Vol 28 (2) ◽  
pp. 149-159
Author(s):  
Jiří Kopal ◽  
Miroslav Rozložník ◽  
Miroslav Tůma

The problem of solving large-scale systems of linear algebraic equations arises in a wide range of applications. In many cases a preconditioned iterative method is the method of choice. This paper deals with the approximate inverse preconditioning AINV/SAINV based on the incomplete generalized Gram–Schmidt process. This type of approximate inverse preconditioning has been used repeatedly for matrix diagonalization in computations of electronic structure, but approximate inverses are of interest in parallel computations in general. Our approach uses adaptive dropping of matrix entries, with control based on computed intermediate quantities. The strategy was introduced as a way to solve difficult application problems and is motivated by recent theoretical results on the loss of orthogonality in the generalized Gram–Schmidt process. Nevertheless, there are aspects of the approach that need to be better understood. Diagonal pivoting based on a rough estimate of the condition numbers of the leading principal submatrices can sometimes provide inefficient preconditioners. This short study proposes another type of pivoting, namely one that exploits incremental condition estimation based on monitoring both the direct and inverse factors of the approximate factorization. Such pivoting remains rather cheap and in many cases can provide a more reliable preconditioner. Numerical examples from real-world problems, small enough to enable a full analysis, are used to illustrate the potential gains of the new approach.
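The AINV/SAINV idea, A-orthogonalizing the columns of the identity by a generalized Gram–Schmidt process while dropping small entries, can be sketched for a dense SPD matrix. The fixed drop tolerance here stands in for the paper's adaptive, quantity-controlled dropping; with `drop_tol=0` the factor satisfies Zᵀ A Z = D exactly, so Z D⁻¹ Zᵀ recovers the inverse.

```python
import numpy as np

def sainv(A, drop_tol=0.1):
    """Right-looking A-orthogonalization (AINV-style) with entry dropping.

    Returns a unit upper-triangular Z and diagonal d with Z.T @ A @ Z ~ diag(d),
    so Z @ diag(1/d) @ Z.T approximates inv(A)."""
    n = A.shape[0]
    Z = np.eye(n)
    d = np.zeros(n)
    for i in range(n):
        Az = A @ Z[:, i]
        d[i] = Z[:, i] @ Az                  # pivot: z_i^T A z_i
        for j in range(i + 1, n):
            coef = (Az @ Z[:, j]) / d[i]
            Z[:, j] -= coef * Z[:, i]        # A-orthogonalize z_j against z_i
            # dropping: discard entries below the tolerance to keep Z sparse
            Z[np.abs(Z[:, j]) < drop_tol, j] = 0.0
    return Z, d
```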


2012 ◽  
Vol 1 (33) ◽  
pp. 56
Author(s):  
Hisham El Safti ◽  
Matthias Kudella ◽  
Hocine Oumeraci

A finite volume model is developed for modelling the behaviour of the seabed beneath monolithic breakwaters. The fully coupled and fully dynamic Biot governing equations are solved in a segregated approach. Two simplifications of the governing equations are presented and tested: (i) the pore-fluid acceleration is neglected completely (the u-p approximation) and (ii) only the convective part is neglected. It is found that neglecting pore-fluid convection does not reduce the computational time for the presented model. Verification of the model results against the analytical solution of the quasi-static equations is presented. A multi-yield-surface plasticity model is implemented to simulate the foundation behaviour under cyclic loads. Preliminary validation of the model against large-scale physical model data is presented.


Author(s):  
Syed A. Ali ◽  
Gautham Kollu ◽  
Sandip Mazumder ◽  
P. Sadayappan

Non-equilibrium heat conduction, as occurring in modern-day sub-micron semiconductor devices, can be predicted effectively using the Boltzmann Transport Equation (BTE) for phonons. In this article, strategies and algorithms for large-scale parallel computation of the phonon BTE are presented. An unstructured finite volume method for spatial discretization is coupled with the control angle discrete ordinates method for angular discretization. The single-time relaxation approximation is used to treat phonon-phonon scattering. Both dispersion and polarization of the phonons are accounted for. Three different parallelization strategies are explored: (a) band-based, (b) direction-based, and (c) hybrid band/cell-based. Subsequent to validation studies in which silicon thin-film thermal conductivity was successfully predicted, transient simulations of non-equilibrium thermal transport were conducted in a three-dimensional device-like silicon structure, discretized using 604,054 tetrahedral cells. The angular space was discretized using 400 angles, and the spectral space was discretized into 40 spectral intervals (bands). This resulted in ∼9.7 × 10⁹ unknowns, approximately three orders of magnitude more than previously reported computations in this area. Studies showed that the direction-based and hybrid band/cell-based parallelization strategies resulted in similar total computational time. However, the parallel efficiency of the hybrid band/cell-based strategy (about 88%) was found to be superior to that of the direction-based strategy, and it is recommended as the preferred strategy for even larger-scale computations.
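A hybrid band/cell-based strategy amounts to laying the MPI ranks out on a 2-D grid and giving each rank a slice of the spectral bands and a slice of the cells. A pure-Python sketch of such a partition map is shown below; the rank-grid shape and the contiguous chunking rule are assumptions for illustration, not the authors' exact decomposition.

```python
def hybrid_partition(n_bands, n_cells, rank, ranks_band, ranks_cell):
    """Map one rank in a (ranks_band x ranks_cell) grid to its work slice:
    a contiguous chunk of spectral bands and a contiguous chunk of cells."""
    rb, rc = divmod(rank, ranks_cell)

    def chunk(n, parts, i):
        # balanced contiguous chunking: slice sizes differ by at most one
        return range(i * n // parts, (i + 1) * n // parts)

    return chunk(n_bands, ranks_band, rb), chunk(n_cells, ranks_cell, rc)
```

Every (band, cell) work unit lands on exactly one rank, and the per-rank load is balanced to within one chunk row or column.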


1994 ◽  
Vol 3 (3) ◽  
pp. 201-225 ◽  
Author(s):  
Can Özturan ◽  
Balaram Sinharoy ◽  
Boleslaw K. Szymanski

Parallel computation is becoming indispensable in solving large-scale problems in science and engineering, yet its use is limited by the high cost of developing the needed software. There is therefore a need for compiler technology that, given the source program, will generate efficient parallel code for different architectures with minimal user involvement. To overcome this difficulty we advocate a comprehensive approach to the development of scalable, architecture-independent software for scientific computation, based on our experience with the equational programming language (EPL). Our approach rests on program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by source-program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by compiler-generated code that redistributes computation and data during object-program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.


Author(s):  
Ruo Wang ◽  
Yanghua Wang ◽  
Ying Rao

The seismic reflectivity inversion problem can be formulated using a basis-pursuit method, aiming to generate a sparse reflectivity series of the subsurface media. In the basis-pursuit method, the reflectivity series is composed of a large number of even and odd dipoles, so the seismic response matrix is huge and the matrix operations involved in the inversion are very time-consuming. To accelerate the matrix computation, a basis-pursuit seismic inversion algorithm is implemented on a Graphics Processing Unit (GPU). In the basis-pursuit inversion algorithm, the problem is posed with an L1-norm model constraint for sparsity, and this L1-norm basis-pursuit problem is reformulated using a linear programming method. The core problems in the inversion are large-scale linear systems, which are solved by a parallelised conjugate gradient method. The performance of this fully parallelised implementation is evaluated against a conventional serial implementation. Specifically, investigation using several field seismic data sets of different sizes indicates that GPU-based parallelisation can reduce the computational time by an overall factor of up to 145. This efficiency improvement demonstrates the great potential of the basis-pursuit inversion method for practical application to large-scale seismic reflectivity inversion problems.
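The large-scale linear systems at the core of the inversion are solved with the conjugate gradient method; a plain serial NumPy version is sketched below for reference (the paper's contribution is the GPU parallelisation, which is not reproduced here). CG assumes a symmetric positive definite system.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Plain conjugate gradient for symmetric positive definite A."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # conjugate update of the direction
        rs = rs_new
    return x
```

On a GPU the dominant costs, the matrix-vector product `A @ p` and the dot products, are exactly the operations that parallelise well, which is where the reported speed-up comes from.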


2011 ◽  
Vol 8 (1) ◽  
pp. 189-197
Author(s):  
O.A. Solnyshkina

A creeping flow of a viscous fluid in a channel is considered in a 3D formulation. The fluid motion is described by the Stokes equations. The problem is solved numerically using the boundary element method, and the obtained results are compared with the analytical solution. To accelerate the calculations for large-scale problems, the matrix-vector product component of the software is developed and parallelized on graphics processors. The paper presents the results of GPU utilization for the considered problems.


2019 ◽  
Author(s):  
Liqun Cao ◽  
Jinzhe Zeng ◽  
Mingyuan Xu ◽  
Chih-Hao Chin ◽  
Tong Zhu ◽  
...  

Combustion is an important class of reactions that affects daily life and the development of aerospace. Exploring reaction mechanisms contributes to the understanding of combustion and to more efficient use of fuels. Ab initio quantum mechanical (QM) calculation is precise but limited by its computational cost for large-scale systems. In order to carry out reactive molecular dynamics (MD) simulations of combustion accurately and quickly, we develop the MFCC-combustion method in this study, which calculates the interactions between atoms with a QM method at the MN15/6-31G(d) level. Each molecule in the system is treated as a fragment, and when the distance between any two atoms in different molecules is less than 3.5 Å, a new fragment involving the two molecules is produced in order to account for the two-body interaction. The deviations of MFCC-combustion from full-system calculations are within a few kcal/mol, and the results clearly show that the calculated energies of the different systems using MFCC-combustion are close to converged once the distance threshold for the two-body QM interactions exceeds 3.5 Å. Methane combustion was studied with the MFCC-combustion method to explore the combustion mechanism of the methane-oxygen system.
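The fragmentation rule described above can be sketched in plain Python: the total energy is the sum of monomer energies plus two-body corrections for molecule pairs that come within the distance threshold. The `qm_energy` callable is a hypothetical stand-in for the MN15/6-31G(d) calculation.

```python
import math
from itertools import combinations

def mfcc_energy(molecules, qm_energy, cutoff=3.5):
    """Fragment-based total energy: monomer energies plus two-body
    corrections for molecule pairs with any interatomic distance below
    the cutoff (angstroms). Each molecule is a list of (x, y, z) atomic
    coordinates; qm_energy(fragment) stands in for a real QM call."""
    total = sum(qm_energy(m) for m in molecules)
    for a, b in combinations(molecules, 2):
        if any(math.dist(p, q) < cutoff for p in a for q in b):
            # dimer fragment corrects for the two-body interaction
            total += qm_energy(a + b) - qm_energy(a) - qm_energy(b)
    return total
```

With a strictly additive energy function the pair corrections vanish, so the sketch only deviates from a plain monomer sum when the energy model is non-additive, which is exactly the interaction the dimer fragments are meant to recover.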


Author(s):  
Jian Liu ◽  
Yong Yu ◽  
Chenqi Zhu ◽  
Yu Zhang

Finite volume method (FVM)-based computational fluid dynamics (CFD) technology has been applied to the non-invasive diagnosis of coronary artery stenosis. Nonetheless, FVM is time-consuming. Besides FVM, the lattice Boltzmann method (LBM) is used in fluid flow simulation. Unlike FVM, which solves the Navier–Stokes equations, LBM directly solves the simplified Boltzmann equation, thus saving computational time. In this study, 12 patients with left anterior descending (LAD) stenosis, diagnosed by CTA, are analysed using both FVM and LBM. The velocities, pressures, and wall shear stress (WSS) predicted by FVM and LBM are compared for each patient. In particular, the ratio of average to maximum speed at the stenotic section, which characterises the degree of stenosis, is compared. Finally, the gold standard for LAD stenosis, invasive fractional flow reserve (FFR), is applied to validate the simulation results. Our results show that LBM and FVM are consistent in blood flow simulation. In regions with a high degree of stenosis, the local flow patterns from the two solvers differ slightly, producing minor differences in the local WSS and blood speed ratio estimates; notably, these differences do not lead to inconsistent estimation. Comparison with invasive FFR shows that, in most cases, the non-invasive diagnosis is consistent with the FFR measurements. However, in some cases the non-invasive diagnosis either underestimates or overestimates the degree of stenosis. This deviation is caused by the difference between physiological and simulation conditions, which remains the biggest challenge faced by all CFD-based non-invasive diagnostic methods.
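The LBM alternative to solving the Navier–Stokes equations can be illustrated with a minimal D2Q9 BGK update in NumPy: a local collision toward the discrete equilibrium followed by streaming along the lattice directions. Periodic boundaries and the relaxation time `tau` are illustrative choices, not the study's patient-specific setup.

```python
import numpy as np

# D2Q9 lattice: weights and discrete velocities
w = np.array([4/9] + [1/9] * 4 + [1/36] * 4)
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])

def equilibrium(rho, u):
    """Discrete equilibrium distribution for density rho and velocity u."""
    cu = np.einsum('qd,xyd->qxy', c, u)
    usq = np.einsum('xyd,xyd->xy', u, u)
    return w[:, None, None] * rho * (1 + 3 * cu + 4.5 * cu**2 - 1.5 * usq)

def lbm_step(f, tau=0.8):
    """One BGK lattice Boltzmann update: local collision, then streaming."""
    rho = f.sum(axis=0)                              # macroscopic density
    u = np.einsum('qd,qxy->xyd', c, f) / rho[..., None]  # macroscopic velocity
    f = f + (equilibrium(rho, u) - f) / tau          # relax toward equilibrium
    for q in range(9):                               # stream along lattice links
        f[q] = np.roll(f[q], tuple(c[q]), axis=(0, 1))
    return f
```

Both steps are purely local or nearest-neighbour, which is why LBM is cheap per time step compared with solving a global Navier–Stokes system.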

