High performance linear algebra package for FORTRAN 90

Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software

ACM Transactions on Mathematical Software ◽

10.1145/3441850 ◽

2021 ◽

Vol 47 (2) ◽

pp. 1-28

Author(s):

Goran Flegar ◽

Hartwig Anzt ◽

Terry Cojean ◽

Enrique S. Quintana-Ortí

Keyword(s):

Linear Algebra ◽

Graphics Processing Units ◽

High Performance ◽

Numerical Algorithms ◽

Mixed Precision ◽

Before And After ◽

Memory Accesses ◽

Specialized Hardware ◽

The Individual ◽

Graphics Processing

The use of mixed precision in numerical algorithms is a promising strategy for accelerating scientific applications. In particular, the adoption of specialized hardware and data formats for low-precision arithmetic in high-end GPUs (graphics processing units) has motivated numerous efforts aimed at carefully reducing the working precision in order to speed up computations. For algorithms whose performance is bound by memory bandwidth, the idea of compressing their data before (and after) memory accesses has received considerable attention. One idea is to store an approximate operator, such as a preconditioner, in lower than working precision, ideally without affecting the algorithm's output. We realize the first high-performance implementation of an adaptive precision block-Jacobi preconditioner, which selects the precision format used to store the preconditioner data on the fly, taking into account the numerical properties of the individual preconditioner blocks. We implement the adaptive block-Jacobi preconditioner as production-ready functionality in the Ginkgo linear algebra library, considering not only the precision formats that are part of the IEEE standard, but also customized formats that tailor the length of the exponent and significand to the characteristics of the preconditioner blocks. Experiments run on a state-of-the-art GPU accelerator show that our implementation offers attractive runtime savings.
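The per-block precision selection described in this abstract can be illustrated with a small sketch. It is not Ginkgo's implementation: it assumes, for illustration only, that each block is inverted explicitly, that the criterion is "condition number times unit roundoff stays below a tolerance", and that only the three IEEE formats are candidates. The names `choose_format`, `round_to`, and the tolerance `tol` are hypothetical.

```python
import struct

# Candidate storage formats: (name, struct code, unit roundoff, largest finite value).
# Only IEEE half/single/double are considered here; custom formats are out of scope.
FORMATS = [
    ("half", "e", 2.0 ** -11, 65504.0),
    ("single", "f", 2.0 ** -24, 3.4028234663852886e38),
    ("double", "d", 2.0 ** -53, float("inf")),
]
CODES = {name: code for name, code, _, _ in FORMATS}


def invert(block):
    """Invert a small dense block by Gauss-Jordan elimination with partial pivoting."""
    n = len(block)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(block)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular block")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def norm1(m):
    """Matrix 1-norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in m) for j in range(len(m[0])))


def choose_format(block, tol=1e-2):
    """Pick the cheapest format whose rounding error the block can tolerate.

    Assumed criterion (hypothetical): cond_1(block) * unit_roundoff <= tol,
    and every entry of the inverted block must fit in the format's range.
    """
    inv = invert(block)
    cond = norm1(block) * norm1(inv)
    largest = max(abs(v) for row in inv for v in row)
    for name, _, u, max_finite in FORMATS:
        if cond * u <= tol and largest <= max_finite:
            return name
    return "double"


def round_to(values, name):
    """Round every entry to the chosen storage format and read it back as a float."""
    code = CODES[name]
    return [[struct.unpack(code, struct.pack(code, v))[0] for v in row] for row in values]
```

For example, under these assumptions `choose_format([[1.0, 0.0], [0.0, 1.0]])` selects half precision, while a nearly singular block such as `[[1.0, 1.0], [1.0, 1.0001]]` is kept in single precision.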


The DEC High Performance Fortran 90 compiler front end

Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation ◽

10.1109/fmpc.1995.380464 ◽

2002 ◽

Cited By ~ 2

Author(s):

D.B. Loveman

Keyword(s):

High Performance ◽

High Performance Fortran ◽

Front End ◽

Fortran 90


Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency

Computer Science - Research and Development ◽

10.1007/s00450-011-0191-z ◽

2011 ◽

Vol 27 (4) ◽

pp. 277-287 ◽

Cited By ~ 17

Author(s):

Hatem Ltaief ◽

Piotr Luszczek ◽

Jack Dongarra

Keyword(s):

Energy Efficiency ◽

Linear Algebra ◽

High Performance ◽

Multicore Architectures ◽

Dense Linear Algebra ◽

Power And Energy


A survey of power and energy efficient techniques for high performance numerical linear algebra operations

Parallel Computing ◽

10.1016/j.parco.2014.09.001 ◽

2014 ◽

Vol 40 (10) ◽

pp. 559-573 ◽

Cited By ~ 19

Author(s):

Li Tan ◽

Shashank Kothapalli ◽

Longxiang Chen ◽

Omar Hussaini ◽

Ryan Bissiri ◽

...

Keyword(s):

Linear Algebra ◽

Energy Efficient ◽

High Performance ◽

Numerical Linear Algebra ◽

Power And Energy


A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs

GPU Computing Gems Jade Edition ◽

10.1016/b978-0-12-385963-1.00034-4 ◽

2012 ◽

pp. 473-484 ◽

Cited By ~ 6

Author(s):

Emmanuel Agullo ◽

Cédric Augonnet ◽

Jack Dongarra ◽

Hatem Ltaief ◽

Raymond Namyst ◽

...

Keyword(s):

Linear Algebra ◽

High Performance


Numerical Linear Algebra for High-Performance Computers

10.1137/1.9780898719611 ◽

1998 ◽

Cited By ~ 201

Author(s):

Jack J. Dongarra ◽

Iain S. Duff ◽

Danny C. Sorensen ◽

Henk A. van der Vorst

Keyword(s):

Linear Algebra ◽

High Performance ◽

Numerical Linear Algebra ◽

High Performance Computers


Linear algebra libraries for high-performance computers: a personal perspective

IEEE Parallel & Distributed Technology Systems & Applications ◽

10.1109/88.219856 ◽

1993 ◽

Vol 1 (1) ◽

pp. 17-24 ◽

Cited By ~ 3

Author(s):

J. Dongarra

Keyword(s):

Linear Algebra ◽

High Performance ◽

Personal Perspective ◽

High Performance Computers


A Linear Algebra Framework for Static High Performance Fortran Code Distribution

Scientific Programming ◽

10.1155/1997/195689 ◽

1997 ◽

Vol 6 (1) ◽

pp. 3-27 ◽

Cited By ~ 22

Author(s):

Corinne Ancourt ◽

Fabien Coelho ◽

François Irigoin ◽

Ronan Keryell

Keyword(s):

Linear Algebra ◽

High Performance ◽

Address Space ◽

Data Parallel ◽

High Performance Fortran ◽

Multiple Data ◽

Fortran Code ◽

Code Distribution ◽

Overlap Analysis ◽

Data Parallel Programming

High Performance Fortran (HPF) was developed to support data parallel programming for single-instruction multiple-data (SIMD) and multiple-instruction multiple-data (MIMD) machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors, and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds, and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, and overlap analysis. The systematic use of an affine framework makes it possible to prove the compilation scheme correct.
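The mapping from a global array index to a processor and a local index, which the compiler derives from distribution directives, can be sketched for the two standard one-dimensional HPF distributions. This is a minimal illustration and not the paper's affine compilation scheme; the function names are hypothetical, and processors and indices are numbered from zero for simplicity.

```python
def block_owner(i, n, p):
    """BLOCK distribution of n elements over p processors.

    Returns (processor, local index) for global index i, using the
    block size ceil(n / p).
    """
    b = -(-n // p)  # ceiling division
    return i // b, i % b


def cyclic_owner(i, p):
    """CYCLIC distribution: elements are dealt out round-robin."""
    return i % p, i // p


def local_bounds_block(lo, hi, n, p, proc):
    """Tight bounds of the global range [lo, hi) owned by processor proc.

    With these bounds each processor iterates only over its own elements,
    so no per-iteration ownership guard is needed. An empty range is
    returned as (start, start).
    """
    b = -(-n // p)
    start = max(lo, proc * b)
    stop = min(hi, (proc + 1) * b, n)
    return start, max(start, stop)
```

For 10 elements on 4 processors the block size is 3, so global index 5 lives on processor 1 at local index 2 under BLOCK, and on processor 1 at local index 1 under CYCLIC.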


High Performance Linear Algebra Operations on Reconfigurable Systems

ACM/IEEE SC 2005 Conference (SC'05) ◽

10.1109/sc.2005.31 ◽

2005 ◽

Cited By ~ 17

Author(s):

Ling Zhuo ◽

V.K. Prasanna

Keyword(s):

Linear Algebra ◽

High Performance ◽

Reconfigurable Systems


LAPACK95 - HIGH PERFORMANCE LINEAR ALGEBRA PACKAGE

Mathematical Modelling and Analysis ◽

10.3846/13926292.2000.9637127 ◽

2000 ◽

Vol 5 (1) ◽

pp. 44-54 ◽

Cited By ~ 2

Author(s):

J. Dongarra ◽

J. Waśniewski

Keyword(s):

Linear Algebra ◽

High Performance ◽

Complex Data ◽

Double Precision ◽

Data Types

LAPACK95 is a set of FORTRAN95 subroutines that interfaces FORTRAN95 with LAPACK. All LAPACK driver subroutines (including expert drivers) and some LAPACK computational routines have both generic LAPACK95 interfaces and generic LAPACK77 interfaces. The remaining computational routines have only generic LAPACK77 interfaces. In both types of interfaces, no distinction is made between single and double precision or between real and complex data types.
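The idea of a generic interface can be sketched as a lookup from one generic name to the type-specific LAPACK routine. In Fortran the compiler does this resolution from the argument types; the Python function below only mimics the outcome, and `resolve` with its two boolean flags is a hypothetical helper, not part of LAPACK95. The S/D/C/Z prefixes are the standard LAPACK naming convention.

```python
# LAPACK routine-name prefixes keyed by (is_complex, is_double).
PREFIX = {
    (False, False): "S",  # real, single precision
    (False, True): "D",   # real, double precision
    (True, False): "C",   # complex, single precision
    (True, True): "Z",    # complex, double precision
}


def resolve(generic_name, is_complex, is_double):
    """Map a generic name such as 'LA_GESV' to the specific routine name."""
    if not generic_name.startswith("LA_"):
        raise ValueError("expected a generic name of the form LA_xxxx")
    return PREFIX[(is_complex, is_double)] + generic_name[3:]
```

Under this sketch, a call through the generic name `LA_GESV` with double precision real arguments would end up in `DGESV`, and the caller never spells out the prefix.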
