Implementing iterative solvers for irregular sparse matrix problems in high performance Fortran

Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples the algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.
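The decoupling described in this abstract can be illustrated with a minimal sketch. This is plain Python, not Belos or its C++ API: the names `conjugate_gradient`, `dot`, and `tridiagonal_operator` are hypothetical, and conjugate gradient is chosen here only as a familiar iterative method. The solver sees the matrix solely through a callable that applies it to a vector, so the storage format can change without touching the algorithm.

```python
from typing import Callable, List

Vector = List[float]


def dot(x: Vector, y: Vector) -> float:
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def conjugate_gradient(apply_A: Callable[[Vector], Vector], b: Vector,
                       tol: float = 1e-10, max_iter: int = 1000) -> Vector:
    """Solve A x = b for symmetric positive definite A.

    The solver never inspects A; it only calls apply_A(v) to get A @ v.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 <= tol:
            break          # residual norm is small enough
        Ap = apply_A(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def tridiagonal_operator(v: Vector) -> Vector:
    """Matrix-free 1D Laplacian with stencil [-1, 2, -1] (assumed test operator)."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]
```

Calling `conjugate_gradient(tridiagonal_operator, b)` solves the system without ever forming a matrix; any other function that applies a symmetric positive definite operator could be passed instead.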

Scientific Programming with High Performance Fortran: A Case Study Using the xHPF Compiler

Scientific Programming ◽

10.1155/1997/528513 ◽

1997 ◽

Vol 6 (1) ◽

pp. 127-152

Author(s):

Eric De Sturler ◽

Volker Strumpen

Keyword(s):

High Performance ◽

Parallel Implementation ◽

Gaussian Elimination ◽

Primary Objective ◽

Matrix Product ◽

Dense Matrix ◽

High Performance Fortran ◽

Partial Pivoting ◽

Intel Paragon

Recently, the first commercial High Performance Fortran (HPF) subset compilers have appeared. This article reports on our experiences with the xHPF compiler of Applied Parallel Research, version 1.2, for the Intel Paragon. At this stage, we do not expect very high performance from our HPF programs, even though performance will eventually be of paramount importance for the acceptance of HPF. Instead, our primary objective is to study how to convert large Fortran 77 (F77) programs to HPF such that the compiler generates reasonably efficient parallel code. We report on a case study that identifies several problems when parallelizing code with HPF; most of these problems affect current HPF compiler technology in general, although some are specific to the xHPF compiler. We discuss our solutions from the perspective of the scientific programmer, and present timing results on the Intel Paragon. The case study comprises three programs of different complexity with respect to parallelization. We use the dense matrix-matrix product to show that the distribution of arrays and the order of nested loops significantly influence the performance of the parallel program. We use Gaussian elimination with partial pivoting to study the parallelization strategy of the compiler. There are various ways to structure this algorithm for a particular data distribution. This example shows how much effort may be demanded from the programmer to support the compiler in generating an efficient parallel implementation. Finally, we use a small application to show that the more complicated structure of a larger program may introduce problems for the parallelization, even though all subroutines of the application are easy to parallelize by themselves. The application consists of a finite volume discretization on a structured grid and a nested iterative solver.
Our case study shows that it is possible to obtain reasonably efficient parallel programs with xHPF, although the compiler needs substantial support from the programmer.
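For readers unfamiliar with the second test program, the following is a minimal serial sketch of Gaussian elimination with partial pivoting. It is written in Python for illustration only; it is not the authors' Fortran or HPF code, it says nothing about data distribution or parallelization, and the function name `solve_with_partial_pivoting` is hypothetical.

```python
def solve_with_partial_pivoting(A, b):
    """Solve A x = b by Gaussian elimination with partial (row) pivoting.

    A is a list of n rows of n numbers, b a list of n numbers.
    Neither input is modified.
    """
    n = len(A)
    # Work on an augmented copy [A | b] so the caller's data stays intact.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]

    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        pivot_row = max(range(k, n), key=lambda i: abs(M[i][k]))
        if M[pivot_row][k] == 0.0:
            raise ValueError("matrix is singular")
        M[k], M[pivot_row] = M[pivot_row], M[k]

        # Eliminate column k from all rows below the pivot row.
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]

    # Back substitution on the resulting upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x
```

The pivot search and the row swap in each step touch a whole column and two whole rows, which hints at why the choice of data distribution matters when an algorithm of this shape is parallelized.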

Three-Dimensional Electromagnetic Particle-in-Cell Code Using High Performance Fortran on PC Cluster

Lecture Notes in Computer Science - High Performance Computing ◽

10.1007/3-540-47847-7_48 ◽

2002 ◽

pp. 515-525 ◽

Cited By ~ 1

Author(s):

DongSheng Cai ◽

Yaoting Li ◽

Ken-ichi Nishikawa ◽

Chiejie Xiao ◽

Xiaoyan Yan

Keyword(s):

High Performance ◽

Three Dimensional ◽

Particle In Cell ◽

Pc Cluster ◽

High Performance Fortran
