Tridiagonal Systems
Recently Published Documents


Total documents: 134 (five years: 3)
H-index: 15 (five years: 0)

2021, Vol 37, pp. 434-491. Author(s): Kazumasa Nomura, Paul Terwilliger

There is a concept in linear algebra called a tridiagonal pair. The concept was motivated by the theory of $Q$-polynomial distance-regular graphs. We give a tutorial introduction to tridiagonal pairs, working with a special case as a concrete example. The special case is called totally bipartite (TB). Starting from first principles, we give an elementary but comprehensive account of TB tridiagonal pairs. The following topics are discussed: (i) the notion of a TB tridiagonal system; (ii) the eigenvalue array; (iii) the standard basis and matrix representations; (iv) the intersection numbers; (v) the Askey--Wilson relations; (vi) a recurrence involving the eigenvalue array; (vii) the classification of TB tridiagonal systems; (viii) self-dual TB tridiagonal pairs and systems; (ix) the $\mathbb{Z}_3$-symmetric Askey--Wilson relations; (x) some automorphisms and antiautomorphisms associated with a TB tridiagonal pair; and (xi) an action of the modular group ${\rm PSL}_2(\mathbb{Z})$ associated with a TB tridiagonal pair.
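For readers new to the subject, the standard definition that such a tutorial builds on (due to Ito, Tanabe, and Terwilliger) can be sketched as follows; the notation here is the conventional one and is not taken from the paper itself.

```latex
Let $V$ denote a nonzero finite-dimensional vector space over a field
$\mathbb{F}$. A \emph{tridiagonal pair} on $V$ is an ordered pair $A, A^*$
of linear maps $V \to V$ such that:
(i)   each of $A$, $A^*$ is diagonalizable;
(ii)  there exists an ordering $V_0, V_1, \ldots, V_d$ of the eigenspaces
      of $A$ with $A^* V_i \subseteq V_{i-1} + V_i + V_{i+1}$ for
      $0 \le i \le d$, where $V_{-1} = 0$ and $V_{d+1} = 0$;
(iii) there exists an ordering $V^*_0, V^*_1, \ldots, V^*_\delta$ of the
      eigenspaces of $A^*$ with $A V^*_i \subseteq V^*_{i-1} + V^*_i + V^*_{i+1}$
      for $0 \le i \le \delta$, where $V^*_{-1} = 0$ and $V^*_{\delta+1} = 0$;
(iv)  there is no subspace $W \subseteq V$ with $A W \subseteq W$,
      $A^* W \subseteq W$, $W \ne 0$, $W \ne V$.
```

Conditions (ii) and (iii) are the "tridiagonal" actions that give the pair its name; the totally bipartite case discussed in the abstract adds further restrictions on these actions.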



2021, Vol 260, pp. 107722. Author(s): Ki-Ha Kim, Ji-Hoon Kang, Xiaomin Pan, Jung-Il Choi


2021, Vol 54 (7), pp. 821-826. Author(s): Aleksandr Y. Aravkin, James V. Burke, Bradley M. Bell, Gianluigi Pillonetto


Author(s): Piotr Zacharzewski, Richard Jefferson-Loveday

Abstract: Both the flow and the geometry inside turbomachinery components such as turbine blades are complex and difficult to handle accurately. Computationally affordable Reynolds-Averaged Navier-Stokes (RANS) simulations are often not suitable, and partially resolving simulations such as Large Eddy Simulation (LES) or hybrid RANS-LES are needed for sufficient accuracy in this area. Within industrial turbine design these are not deployed routinely, if at all, due to their presently unaffordable computational cost and the time-consuming grid generation required for complex geometries. General-Purpose Graphics Processing Units (GPGPUs) and other modern heterogeneous hardware offer much cheaper computational power but have so far remained mostly unharnessed in the field of CFD, owing to the difficulty of creating the structured datasets required to utilise GPUs effectively. While unstructured or hybrid grids can be used on massively parallel platforms, the typically irregular memory access patterns they demand usually prohibit effective scaling, leaving the GPU mostly idle and negating the benefits. Within CFD, structured datasets with ordered memory access patterns are most easily obtained with structured multiblock grids, making such grids an excellent candidate for GPU platforms. This is not without challenges, as creating high-quality structured grids over complex geometries is known to be a highly time-consuming and difficult process. Another limitation of GPUs is the difficulty of solving tridiagonal systems of equations efficiently on these platforms. The solution of such systems is typically required for implicit time advancement or for convergence-acceleration techniques such as AMG, and it is well established that implicit numerical schemes provide significant computational savings due to their efficiency. In the present work a novel Alternating Direction Implicit (ADI) library is integrated into the CFD system to enable scalable solution of tridiagonal systems on GPUs.
In the current paper a GPU-accelerated Immersed Boundary Method (IBM) code is presented and validated for turbomachinery applications. It is shown that the combination of the IBM, the high-level Oxford Parallel library for Structured applications (OPS), and an ADI solver provides geometric as well as computational flexibility unmatched by traditional unstructured solvers. A single source code exists for all major hardware platforms, and the parallel implementation is decoupled from the scientific codebase, making the code scalable and easily adaptable to any emerging future architectures.
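The building block behind the ADI approach described above is a line-by-line tridiagonal solve. The following is a minimal serial sketch of the Thomas algorithm (TDMA) that each ADI sweep applies along one grid direction; this is an illustrative NumPy implementation for exposition, not the paper's GPU code.

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system T x = d with the Thomas algorithm.

    a: sub-diagonal, b: main diagonal, c: super-diagonal, d: right-hand
    side (all length n; a[0] and c[-1] are unused). Assumes the system
    is well conditioned, e.g. diagonally dominant."""
    n = len(b)
    cp = np.empty(n)  # modified super-diagonal
    dp = np.empty(n)  # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):            # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):   # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

In an ADI scheme, one such solve is performed per grid line, per direction, per time step. The recurrences above are inherently sequential, which is precisely why solving many of them efficiently on a GPU requires specialised libraries like the one the abstract describes.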





Author(s): Gennady Shvachych, Ivan Pobochii, Tetyana Khokhlova, Olena Kholod, Volodymyr Busygin, ...


2020, Vol 41 (4), pp. 1546-1570. Author(s): Minghua Chen, Sven-Erik Ekström, Stefano Serra-Capizzano


2019, Vol 2019, pp. 1-13. Author(s): Hamish J. Macintosh, Jasmine E. Banks, Neil A. Kelson

Solving diagonally dominant tridiagonal linear systems is a common problem in scientific high-performance computing (HPC). Furthermore, it is becoming more commonplace for HPC platforms to utilise a heterogeneous combination of computing devices. Whilst it is desirable to design faster implementations of parallel linear system solvers, power consumption is an increasingly important concern. This work presents the oclspkt routine, a heterogeneous OpenCL implementation of the truncated SPIKE algorithm that can use FPGAs, GPUs, and CPUs to concurrently accelerate the solving of diagonally dominant tridiagonal linear systems. The routine is designed to solve tridiagonal systems of any size and can dynamically allocate optimised workloads to each accelerator in a heterogeneous environment depending on the accelerator's compute performance. The truncated SPIKE FPGA solver is developed first, optimising OpenCL device kernel performance, global memory bandwidth, and interleaved host-to-device memory transactions. The FPGA OpenCL kernel code is then refactored and optimised to best exploit the underlying architectures of the CPU and GPU. An optimised TDMA OpenCL kernel is also developed to act as a serial performance baseline for the parallel truncated SPIKE kernel, since no FPGA tridiagonal solver capable of solving large tridiagonal systems was available at the time of development. The individual GPU, CPU, and FPGA solvers of the oclspkt routine are 110%, 150%, and 170% faster, respectively, than comparable device-optimised third-party solvers and applicable baselines. Assessing heterogeneous combinations of compute devices, the GPU + FPGA combination is found to have the best compute performance, and the FPGA-only configuration the best overall estimated energy efficiency.
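To make the truncated SPIKE idea concrete, here is a serial NumPy sketch (not the oclspkt code): the system is split into p partitions, each partition yields a local solution and "spike" columns, and diagonal dominance makes the far spike tips negligible, so the reduced interface system truncates into independent 2x2 solves. The partition count, the equal-block-size assumption, and the use of dense local solves (a real implementation would use a tridiagonal solver per block) are illustrative simplifications.

```python
import numpy as np

def truncated_spike(a, b, c, d, p=4):
    """Approximately solve a diagonally dominant tridiagonal system
    (sub-diagonal a, diagonal b, super-diagonal c, rhs d; a[0] and
    c[-1] unused) with the truncated SPIKE scheme on p partitions.
    Assumes p divides n; the stronger the dominance, the smaller the
    truncation error."""
    n = len(b)
    m = n // p
    g = np.empty(n)   # local solutions  A_j^{-1} d_j
    V = np.zeros(n)   # right spikes     A_j^{-1} (c[e-1] e_m)
    W = np.zeros(n)   # left spikes      A_j^{-1} (a[s]   e_1)
    for j in range(p):
        s, e = j * m, (j + 1) * m
        Aj = (np.diag(b[s:e]) + np.diag(a[s + 1:e], -1)
              + np.diag(c[s:e - 1], 1))
        g[s:e] = np.linalg.solve(Aj, d[s:e])
        if j < p - 1:                     # coupling to the next block
            r = np.zeros(m); r[-1] = c[e - 1]
            V[s:e] = np.linalg.solve(Aj, r)
        if j > 0:                         # coupling to the previous block
            r = np.zeros(m); r[0] = a[s]
            W[s:e] = np.linalg.solve(Aj, r)
    # Truncation: the spike tips far from each interface decay to ~0 for
    # diagonally dominant systems, so the reduced system decouples into
    # an independent 2x2 solve per interface (done in parallel in SPIKE).
    bot = np.zeros(p); top = np.zeros(p)
    for j in range(p - 1):
        t, h = (j + 1) * m - 1, (j + 1) * m
        det = 1.0 - V[t] * W[h]
        bot[j] = (g[t] - V[t] * g[h]) / det
        top[j + 1] = (g[h] - W[h] * g[t]) / det
    # Recover all unknowns from the interface values (also parallel).
    x = np.empty(n)
    for j in range(p):
        s, e = j * m, (j + 1) * m
        xr = top[j + 1] if j < p - 1 else 0.0
        xl = bot[j - 1] if j > 0 else 0.0
        x[s:e] = g[s:e] - V[s:e] * xr - W[s:e] * xl
    return x
```

All per-partition work (local solves, spikes, recovery) is independent, which is what lets the algorithm spread partitions across CPUs, GPUs, and FPGAs as the abstract describes; only the small interface solves need cross-partition data.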


