Direct Solvers: Latest Research Papers

Basic Linear Algebra Subroutines (BLAS) are the well-known low-level workhorse subroutines for vector-vector, matrix-vector, and matrix-matrix operations on full-rank matrices. With the advent of block low-rank (Rk) full-wave direct solvers, where most blocks of the system matrix are Rk, an extension to the BLAS III matrix-matrix workhorse routine is needed due to the agony of Rk addition. This note outlines the problem BLAS III poses for Rk LU and solve operations and then outlines an alternative approach, which we will call BLAS IV. This approach exploits the thrill of Rk matrix-matrix multiplication and uses the Adaptive Cross Approximation (ACA) as a methodology to evaluate sums of Rk terms, circumventing the agony of low-rank addition.
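
The contrast between cheap Rk multiplication and costly Rk addition can be illustrated with a minimal NumPy sketch. It assumes each Rk block is stored as a pair of factors (U, V) with block = U V^T. The function names `rk_multiply` and `rk_add` are hypothetical, and the recompression shown uses a QR plus truncated SVD as a stand-in; it is not the ACA-based scheme the note proposes.

```python
import numpy as np

def rk_multiply(U1, V1, U2, V2):
    """(U1 V1^T)(U2 V2^T) = U1 (V1^T U2) V2^T, so the rank is at most min(r1, r2)."""
    core = V1.T @ U2                      # small r1 x r2 coupling matrix
    if core.shape[0] <= core.shape[1]:
        return U1, V2 @ core.T            # keep r1 columns
    return U1 @ core, V2                  # keep r2 columns

def rk_add(U1, V1, U2, V2, tol=1e-10):
    """U1 V1^T + U2 V2^T: stacking the factors gives rank r1 + r2, so recompress."""
    U = np.hstack([U1, U2])
    V = np.hstack([V1, V2])
    Qu, Ru = np.linalg.qr(U)              # orthogonalize both stacked factors
    Qv, Rv = np.linalg.qr(V)
    W, s, Zt = np.linalg.svd(Ru @ Rv.T)   # SVD of the small (r1+r2) x (r1+r2) core
    k = max(1, int(np.sum(s > tol * s[0])))   # keep singular values above the tolerance
    return Qu @ (W[:, :k] * s[:k]), Qv @ Zt[:k].T

rng = np.random.default_rng(0)
U1, V1 = rng.standard_normal((50, 3)), rng.standard_normal((50, 3))
U2, V2 = rng.standard_normal((50, 4)), rng.standard_normal((50, 4))

Um, Vm = rk_multiply(U1, V1, U2, V2)      # product: rank does not grow
Ua, Va = rk_add(U1, V1, U2, V2)           # sum: rank grows to r1 + r2 before truncation
```

The product needs only one small matrix multiply and never increases the rank, whereas every sum requires two QR factorizations and an SVD just to keep the rank under control, which is the cost the abstract refers to as the agony of addition.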

Computational Costs of Multi-Frontal Direct Solvers with Analysis-Suitable T-Splines

Symmetry ◽

10.3390/sym12122070 ◽

2020 ◽

Vol 12 (12) ◽

pp. 2070

Author(s):

Anna Paszyńska ◽

Maciej Paszyński

Keyword(s):

Isogeometric Analysis ◽

Minimum Degree ◽

Computational Cost ◽

Building Blocks ◽

Adaptive Grids ◽

Direct Solver ◽

Computational Costs ◽

Direct Solvers ◽

The Matrix ◽

Transfer Problems

In this paper, we consider the computational cost of a multi-frontal direct solver used for the factorization of matrices resulting from a discretization of isogeometric analysis with T-splines, and analysis-suitable T-splines. We start from model projection or model heat transfer problems discretized over two-dimensional meshes, either uniformly refined or refined towards a point or an edge. These grids preserve several symmetries and they are the building blocks of more complicated grids constructed during adaptive isotropic refinement procedures. A large class of computational problems construct meshes refined towards point or edge singularities. We propose an ordering that permutes the matrix in a way that the computational cost of a multi-frontal solver executed on adaptive grids is linear. We show that analysis-suitable T-splines with our ordering, besides having other well-known advantages, also significantly reduce the computational cost of factorization with the multi-frontal direct solver. Namely, the factorization with N T-splines of order p over meshes refined to a point has a linear O(N p^4) cost, and the factorization with T-splines on meshes refined to an edge has O(N 2^p p^2) cost. We compare the execution time of the multi-frontal solver with our ordering to the Approximate Minimum Degree (AMD) and Cuthill–McKee orderings available in Octave.
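
Why the elimination ordering drives factorization cost can be seen on a toy problem. The sketch below is a generic illustration, not the ordering proposed in the paper: it assumes a small symmetric positive definite "arrow" matrix (one unknown coupled to all others) and uses a dense Cholesky factorization from NumPy to count the nonzeros in the factor. The helper names `arrow_matrix` and `cholesky_nonzeros` are hypothetical.

```python
import numpy as np

def arrow_matrix(n):
    """SPD 'arrow' matrix: dense first row and column, diagonal elsewhere."""
    A = n * np.eye(n)          # diagonal n makes the matrix strictly diagonally dominant
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    return A

def cholesky_nonzeros(A, tol=1e-12):
    """Count entries of the Cholesky factor L whose magnitude exceeds tol."""
    L = np.linalg.cholesky(A)
    return int(np.sum(np.abs(L) > tol))

n = 8
A = arrow_matrix(n)

# Reverse the ordering so the fully coupled unknown is eliminated last.
perm = np.arange(n)[::-1]
A_reordered = A[np.ix_(perm, perm)]

fill_natural = cholesky_nonzeros(A)              # dense factor: n(n+1)/2 nonzeros
fill_reordered = cholesky_nonzeros(A_reordered)  # sparse factor: 2n - 1 nonzeros
print(fill_natural, fill_reordered)
```

Eliminating the fully coupled unknown first couples all remaining unknowns to each other, so the factor fills in completely; eliminating it last creates no fill at all. Orderings such as AMD and Cuthill–McKee are heuristics that pursue the same goal on general sparse matrices.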

Pragmatic solvers for 3D Stokes and elasticity problems with heterogeneous coefficients: evaluating modern incomplete LDL^T preconditioners

Solid Earth ◽

10.5194/se-11-2031-2020 ◽

2020 ◽

Vol 11 (6) ◽

pp. 2031-2045

Author(s):

Patrick Sanan ◽

Dave A. May ◽

Matthias Bollhöfer ◽

Olaf Schenk

Keyword(s):

Saddle Point ◽

Degrees Of Freedom ◽

Stokes Equations ◽

Mixed Finite Element ◽

Schur Complement ◽

Numerical Linear Algebra ◽

Superior Performance ◽

Weighted Matching ◽

Direct Solvers ◽

Multiple Inclusion

Abstract. The need to solve large saddle point systems within computational Earth sciences is ubiquitous. Physical processes giving rise to these systems include porous flow (the Darcy equations), poroelasticity, elastostatics, and highly viscous flows (the Stokes equations). The numerical solution of saddle point systems is non-trivial since the operators are indefinite. Primary tools to solve such systems are direct solution methods (exact triangular factorization) or approximate block factorization (ABF) preconditioners. While ABF solvers have emerged as the state-of-the-art scalable option, they are invasive solvers requiring splitting of pressure and velocity degrees of freedom, a multigrid hierarchy with tuned transfer operators and smoothers, machinery to construct complex Schur complement preconditioners, and the expertise to select appropriate parameters for a given coefficient regime – they are far from being “black box” solvers. Modern direct solvers, which robustly produce solutions to almost any system, do so at the cost of rapidly growing time and memory requirements for large problems, especially in 3D. Incomplete LDL^T (ILDL) factorizations, with symmetric maximum weighted-matching preprocessing, used as preconditioners for Krylov (iterative) methods, have emerged as an efficient means to solve indefinite systems. These methods have been developed within the numerical linear algebra community but have yet to become widely used in applications, despite their practical potential; they can be used whenever a direct solver can, only requiring an assembled operator, yet can offer comparable or superior performance, with the added benefit of having a much lower memory footprint. In comparison to ABF solvers, they only require the specification of a drop tolerance and thus provide an easy-to-use addition to the solver toolkit for practitioners.
Here, we present solver experiments employing incomplete LDL^T factorization with symmetric maximum weighted-matching preprocessing to precondition operators and compare these to direct solvers and ABF-preconditioned iterative solves. To ensure the comparison study is meaningful for Earth scientists, we utilize matrices arising from two prototypical problems, namely Stokes flow and quasi-static (linear) elasticity, discretized using standard mixed finite-element spaces. Our test suite targets problems with large coefficient discontinuities across non-grid-aligned interfaces, which represent a common challenging-for-solvers scenario in Earth science applications. Our results show that (i) as the coefficient structure is made increasingly challenging, by introducing high contrast and complex topology with a multiple-inclusion benchmark, the ABF solver can break down, becoming less efficient than the ILDL solver before breaking down entirely; (ii) ILDL is robust, with a time to solution that is largely independent of the coefficient topology and mildly dependent on the coefficient contrast; (iii) the time to solution obtained using ILDL is typically faster than that obtained from a direct solve, beyond 10^5 unknowns; and (iv) ILDL always uses less memory than a direct solve.
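
The indefiniteness that makes saddle point systems hard can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the blocks A and B are random stand-ins (not a Stokes or elasticity discretization), and the sizes n and m are arbitrary. It builds K = [[A, B^T], [B, 0]] with A symmetric positive definite and counts the signs of the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 2                          # toy sizes: "velocity" and "pressure" unknowns

G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)          # symmetric positive definite (1,1) block
B = rng.standard_normal((m, n))      # constraint block, full row rank with probability 1

# Saddle point matrix with a zero (2,2) block.
K = np.block([[A, B.T],
              [B, np.zeros((m, m))]])

eigs = np.linalg.eigvalsh(K)         # K is symmetric, so its eigenvalues are real
n_pos = int(np.sum(eigs > 0))
n_neg = int(np.sum(eigs < 0))
print(n_pos, n_neg)                  # n positive and m negative eigenvalues
```

Because K is congruent to the block diagonal matrix diag(A, -B A^{-1} B^T), it has n positive and m negative eigenvalues. A Cholesky factorization therefore does not exist, which is why symmetric indefinite LDL^T factorizations (complete or incomplete) are used instead.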

Pragmatic Solvers for 3D Stokes and Elasticity Problems with Heterogeneous Coefficients: Evaluating Modern Incomplete LDL^T Preconditioners

10.5194/se-2020-79 ◽

2020 ◽

Author(s):

Patrick Sanan ◽

Dave A. May ◽

Matthias Bollhöfer ◽

Olaf Schenk

Keyword(s):

Saddle Point ◽

Degrees Of Freedom ◽

Stokes Equations ◽

Earth Science ◽

Mixed Finite Element ◽

Schur Complement ◽

Numerical Linear Algebra ◽

Porous Flow ◽

Weighted Matching ◽

Direct Solvers

Abstract. The need to solve large saddle point systems within computational Earth sciences is ubiquitous. Physical processes giving rise to these systems include porous flow (the Darcy equations), poroelasticity, elastostatics, and highly viscous flows (the Stokes equations). The numerical solution of saddle point systems is non-trivial since the operators are indefinite. Primary tools to solve such systems are direct solution methods (exact triangular factorization) or Approximate Block Factorization (ABF) preconditioners. While ABF solvers have emerged as the state-of-the-art scalable option, they are invasive solvers requiring splitting of pressure and velocity degrees of freedom, a multigrid hierarchy with tuned transfer operators and smoothers, machinery to construct complex Schur complement preconditioners, and the expertise to select appropriate parameters for a given coefficient regime – they are far from being "black box" solvers. Modern direct solvers, which robustly produce solutions to almost any system, do so at the cost of rapidly growing time and memory requirements for large problems, especially in 3D. Incomplete LDL (ILDL) factorizations, with symmetric maximum weighted matching preprocessing, used as preconditioners for Krylov (iterative) methods, have emerged as an efficient means to solve indefinite systems. These methods have been developed within the numerical linear algebra community but have yet to become widely used in non-trivial applications, despite their practical potential; they can be used whenever a direct solver can, only requiring an assembled operator, yet can offer comparable or superior performance, with the added benefit of having a much lower memory footprint. In comparison to ABF solvers, they only require the specification of a drop tolerance and thus provide an easy-to-use addition to the solver toolkit for practitioners.
Here, we present solver experiments employing incomplete LDL factorization with symmetric maximum weighted matching preprocessing to precondition operators, and compare these to direct solvers and ABF-preconditioned iterative solves. To ensure the comparison study is meaningful for Earth scientists, we utilize matrices arising from two prototypical problems, namely Stokes flow and quasi-static (linear) elasticity, discretized using standard mixed finite element spaces. Our test suite targets problems with large coefficient discontinuities across non-grid-aligned interfaces, which represent a common, challenging-for-solvers, scenario in Earth science applications. Our results show: (i) as the coefficient structure is made increasingly challenging (high contrast, complex topology), the ABF solver can break down, becoming less efficient than the ILDL solver before breaking down entirely; (ii) ILDL is robust, with a time-to-solution that is largely independent of the coefficient topology and mildly dependent on the coefficient contrast; (iii) the time-to-solution obtained using ILDL is typically faster than that obtained from a direct solve, beyond 10^5 unknowns; (iv) ILDL always uses less memory than a direct solve.
