A new sparse matrix vector multiplication graphics processing unit algorithm designed for finite element problems

J. Wong; E. Kuhl; E. Darve

doi:10.1002/nme.4865


International Journal for Numerical Methods in Engineering ◽

10.1002/nme.4865 ◽

2015 ◽

Vol 102 (12) ◽

pp. 1784-1814 ◽

Cited By ~ 15

Author(s):

J. Wong ◽

E. Kuhl ◽

E. Darve

Keyword(s):

Finite Element ◽

Graphics Processing Unit ◽

Sparse Matrix ◽

Processing Unit ◽

Matrix Vector Multiplication ◽

Graphics Processing ◽

Matrix Vector


Fast sparse matrix-vector multiplication on graphics processing unit for finite element analysis

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems ◽

10.1109/hpcc.2012.193 ◽

2012 ◽

Cited By ~ 11

Author(s):

Abal-Kassim Cheik Ahamed ◽

Frederic Magoules

Keyword(s):

Finite Element Analysis ◽

Finite Element ◽

Graphics Processing Unit ◽

Sparse Matrix ◽

Processing Unit ◽

Element Analysis ◽

Matrix Vector Multiplication ◽

Graphics Processing ◽

Matrix Vector


GPU Accelerated Reconstruction in Compton Scattering Tomography Using Matrix Compression

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.519-520.102 ◽

2014 ◽

Vol 519-520 ◽

pp. 102-107

Author(s):

Yu Fei Yu ◽

Bin Yan ◽

Biao Wang ◽

Lei Li ◽

Yu Han ◽

...

Keyword(s):

Compton Scattering ◽

Graphics Processing Unit ◽

Sparse Matrix ◽

Reconstruction Algorithm ◽

Processing Unit ◽

Matrix Vector Multiplication ◽

Speedup Ratio ◽

Parallel Features ◽

Graphics Processing ◽

Matrix Vector

An acceleration strategy for the TV-ADM reconstruction algorithm in Compton scattering tomography (CST) is proposed. First, because the CST projection matrices are sparse, they are stored in the CSR and ELL sparse matrix formats, which greatly reduces memory consumption. Then, a sparse matrix-vector multiplication (SpMV) method is used to accelerate the projection and back-projection steps. Finally, exploiting the parallel structure of the algorithm, TV-ADM is computed on a graphics processing unit (GPU). Numerical experiments show that TV-ADM with the presented acceleration strategy can achieve a 96-fold speedup and a 224-fold memory compression ratio without loss of precision.
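The two storage formats named in this abstract can be illustrated with a minimal sketch. It is plain sequential Python, not the authors' GPU implementation, and the function names (`dense_to_csr`, `csr_spmv`, `dense_to_ell`, `ell_spmv`) are hypothetical names chosen for illustration. CSR stores only the non-zeros plus row offsets; ELL pads every row to the same width, which gives a regular layout at the cost of some stored zeros.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def dense_to_ell(dense):
    """Convert a dense matrix to ELL form: every row padded to equal width."""
    width = max(sum(1 for v in row if v != 0) for row in dense)
    ell_vals, ell_cols = [], []
    for row in dense:
        entries = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(entries)
        # Padding uses value 0.0 and column 0, so it adds nothing to the sum
        ell_cols.append([j for j, _ in entries] + [0] * pad)
        ell_vals.append([v for _, v in entries] + [0.0] * pad)
    return ell_vals, ell_cols


def ell_spmv(ell_vals, ell_cols, x):
    """Compute y = A @ x for A stored in ELL form."""
    return [
        sum(v * x[j] for v, j in zip(vals, cols))
        for vals, cols in zip(ell_vals, ell_cols)
    ]


if __name__ == "__main__":
    A = [[4, 0, 0, 1], [0, 3, 0, 0], [2, 0, 5, 0], [0, 0, 0, 6]]
    x = [1.0, 2.0, 3.0, 4.0]
    print(csr_spmv(*dense_to_csr(A), x))
    print(ell_spmv(*dense_to_ell(A), x))
```

In both formats each output row is computed independently of the others, which is the property a GPU implementation would exploit by assigning rows to threads.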


Iterative sparse matrix-vector multiplication for accelerating the block Wiedemann algorithm over GF(2) on multi-graphics processing unit systems

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.2896 ◽

2012 ◽

Vol 25 (4) ◽

pp. 586-603 ◽

Cited By ~ 4

Author(s):

Bertil Schmidt ◽

Hans Aribowo ◽

Hoang-Vu Dang

Keyword(s):

Graphics Processing Unit ◽

Sparse Matrix ◽

Processing Unit ◽

Matrix Vector Multiplication ◽

Graphics Processing ◽

Matrix Vector


A novel multi-graphics processing unit parallel optimization framework for the sparse matrix-vector multiplication

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.3936 ◽

2016 ◽

Vol 29 (5) ◽

pp. e3936 ◽

Cited By ~ 10

Author(s):

Jiaquan Gao ◽

Yu Wang ◽

Jun Wang

Keyword(s):

Graphics Processing Unit ◽

Sparse Matrix ◽

Parallel Optimization ◽

Processing Unit ◽

Optimization Framework ◽

Matrix Vector Multiplication ◽

Graphics Processing ◽

Matrix Vector


A new diagonal storage for efficient implementation of sparse matrix–vector multiplication on graphics processing unit

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.6230 ◽

2021 ◽

Author(s):

Guixia He ◽

Qi Chen ◽

Jiaquan Gao

Keyword(s):

Graphics Processing Unit ◽

Sparse Matrix ◽

Efficient Implementation ◽

Processing Unit ◽

Matrix Vector Multiplication ◽

Graphics Processing ◽

Matrix Vector


Finite element method completely implemented for graphic processor units using parallel algorithm libraries

The International Journal of High Performance Computing Applications ◽

10.1177/1094342017694703 ◽

2017 ◽

Vol 33 (1) ◽

pp. 53-66 ◽

Cited By ~ 1

Author(s):

Franz Pichler ◽

Gundolf Haase

Keyword(s):

Finite Element ◽

Graphics Processing Unit ◽

Computational Cost ◽

Processing Unit ◽

Time Step ◽

Device Architecture ◽

Transient Problems ◽

Speed Up ◽

Automotive Batteries ◽

Graphics Processing

A finite element code is developed in which all of the computationally expensive steps are performed on a graphics processing unit via the THRUST and PARALUTION libraries. The code focuses on the simulation of transient problems, where the computations repeated at every time step dominate the computational cost. It is used to solve partial and ordinary differential equations as they arise in thermal-runaway simulations of automotive batteries. The speed-up obtained by using the graphics processing unit for every critical step is compared against the single-core and multi-threaded solutions, which are also supported by the chosen libraries. In this way, a high total speed-up on the graphics processing unit is achieved without programming a single classical Compute Unified Device Architecture (CUDA) kernel.


A three‐stage graphics processing unit‐based finite element analyses matrix generation strategy for unstructured meshes

International Journal for Numerical Methods in Engineering ◽

10.1002/nme.6383 ◽

2020 ◽

Vol 121 (17) ◽

pp. 3824-3848 ◽

Cited By ~ 1

Author(s):

Subhajit Sanfui ◽

Deepak Sharma

Keyword(s):

Finite Element ◽

Graphics Processing Unit ◽

Unstructured Meshes ◽

Processing Unit ◽

Finite Element Analyses ◽

Graphics Processing


GPU-Friendly Preconditioners for Efficient 3-D Finite Element Analysis of Thin Structures

Volume 2: 31st Computers and Information in Engineering Conference, Parts A and B ◽

10.1115/detc2011-47330 ◽

2011 ◽

Cited By ~ 1

Author(s):

Vikalp Mishra ◽

Krishnan Suresh

Keyword(s):

Finite Element Analysis ◽

Finite Element ◽

Sparse Matrix ◽

Grid Method ◽

Double Precision ◽

Thin Structures ◽

Element Analysis ◽

Dual Representation ◽

Matrix Vector Multiplication ◽

Matrix Vector

A serious computational bottleneck in finite element analysis today is the solution of the underlying system of equations. To alleviate this problem, researchers have proposed the use of graphics programmable units (GPU) for fast iterative solution of such equations. Indeed, researchers have shown that a GPU implementation of double-precision sparse matrix-vector multiplication (which underlies all iterative methods) is approximately an order of magnitude faster than an optimized CPU implementation. Unfortunately, fast matrix-vector multiplication alone is insufficient; a good preconditioner is also necessary for rapid convergence. Furthermore, most modern preconditioners, such as incomplete Cholesky, are expensive to compute and cannot be easily ported to the GPU. In this paper, we propose a special class of preconditioners for the analysis of thin structures, such as beams and plates. The proposed preconditioners are developed by combining the multigrid method with the recently developed dual-representation method for thin structures. It is shown that these preconditioners are computationally inexpensive, perform better than standard preconditioners, and can be easily ported to the GPU.
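To make the role of the preconditioner concrete, the following is a minimal sketch of a preconditioned conjugate gradient loop in plain Python. It deliberately swaps in a simple Jacobi (diagonal) preconditioner as a stand-in; it is not the multigrid and dual-representation preconditioner proposed in the paper. The function names and the small test system are assumptions made for illustration. Note that each iteration costs one sparse matrix-vector product plus one preconditioner application.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def pcg_jacobi(values, col_idx, row_ptr, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A (CSR form).

    Uses a Jacobi preconditioner (hypothetical stand-in, M = diag(A)).
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    # Extract 1 / A[i][i]; assumes every diagonal entry is stored and non-zero
    inv_diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                inv_diag[i] = 1.0 / values[k]

    x = [0.0] * n
    r = list(b)                                   # residual r = b - A x, x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, it
        Ap = csr_spmv(values, col_idx, row_ptr, p)  # the SpMV per iteration
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    # A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]] in CSR form, b = A @ [1, 2, 3]
    values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
    col_idx = [0, 1, 0, 1, 2, 1, 2]
    row_ptr = [0, 2, 5, 7]
    x, iters = pcg_jacobi(values, col_idx, row_ptr, [6.0, 10.0, 8.0])
    print(x, iters)
```

A Jacobi preconditioner is trivially parallel but usually weak; the abstract's point is that stronger choices such as incomplete Cholesky are harder to port to the GPU, which motivates the proposed class.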


ACCELERATION OF FINITE ELEMENT COMPUTATION FOR SEISMIC WAVE PROPAGATION USING GRAPHICS PROCESSING UNIT

AIJ Journal of Technology and Design ◽

10.3130/aijt.19.1219 ◽

2013 ◽

Vol 19 (43) ◽

pp. 1219-1224

Author(s):

Kensuke WADA ◽

Shoichi NAKAI ◽

Toru SEKIGUCHI

Keyword(s):

Finite Element ◽

Wave Propagation ◽

Seismic Wave ◽

Graphics Processing Unit ◽

Seismic Wave Propagation ◽

Processing Unit ◽

Element Computation ◽

Graphics Processing ◽

Finite Element Computation


Parallel computations of the step response of a floor heater with the use of a graphics processing unit. Part 2: results and their evaluation

Bulletin of the Polish Academy of Sciences Technical Sciences ◽

10.2478/bpasts-2013-0102 ◽

2013 ◽

Vol 61 (4) ◽

pp. 949-954 ◽

Cited By ~ 1

Author(s):

J. Gołębiowski ◽

J. Forenc

Keyword(s):

Graphics Processing Unit ◽

Sparse Matrix ◽

Temporal Distribution ◽

Step Response ◽

Processing Unit ◽

Commercial Program ◽

Speed Up ◽

Spatio Temporal ◽

Graphics Processing ◽

Linear Systems Of Equations

Using the models and algorithms presented in the first part of the article, the spatio-temporal distribution of the step response of a floor heater was determined. The results are presented as heating curves and temperature profiles of the heater at selected time instants. The computational results were verified by comparing them with the solution obtained using a commercial program, NISA. Additionally, the distribution of the average time constant of the thermal processes occurring in the heater was determined. The use of a graphics processing unit in numerical computations based on the conjugate gradient method was analyzed. It was shown that using a graphics processing unit is profitable when solving linear systems of equations with dense coefficient matrices. In the case of a sparse matrix, the speed-up depends on the number of its non-zero elements.
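One way to see why the number of non-zero elements matters is to compare the arithmetic in a single matrix-vector product, the dominant cost of each conjugate gradient iteration. The sketch below is an illustration only: it counts multiply-add operations for a hypothetical tridiagonal test matrix and does not measure GPU performance or reproduce the paper's heater model. The names `tridiagonal_csr` and `matvec_cost` are assumptions.

```python
def tridiagonal_csr(n):
    """Build an n-by-n tridiagonal matrix (-1, 2, -1) in CSR form."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def matvec_cost(n, row_ptr):
    """Multiply-adds per matrix-vector product: dense storage vs CSR."""
    nnz = row_ptr[-1]  # the last row pointer equals the non-zero count
    return {"dense_multiply_adds": n * n, "csr_multiply_adds": nnz}


if __name__ == "__main__":
    n = 1000
    _, _, row_ptr = tridiagonal_csr(n)
    print(matvec_cost(n, row_ptr))
```

For a dense matrix the work grows as n squared and keeps many GPU threads busy; for a sparse matrix the work is proportional to the non-zero count, so the amount of parallel work available, and with it the attainable speed-up, varies with sparsity.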
