Aggregation of clans to speed-up solving linear systems on parallel architectures

Author(s):  
Dmitry A. Zaitsev ◽  
Tatiana R. Shmeleva ◽  
Piotr Luszczek
Algorithms ◽  
2020 ◽  
Vol 13 (4) ◽  
pp. 100 ◽  
Author(s):  
Luca Bergamaschi

The aim of this survey is to review some recent developments in devising efficient preconditioners for sequences of symmetric positive definite (SPD) linear systems A_k x_k = b_k, k = 1, 2, …, arising in many scientific applications, such as the discretization of transient Partial Differential Equations (PDEs), the solution of eigenvalue problems, (inexact) Newton methods applied to nonlinear systems, and rational Krylov methods for computing a function of a matrix. In this paper, we will analyze a number of techniques for updating a given initial preconditioner by a low-rank matrix with the aim of improving the clustering of eigenvalues around 1, in order to speed up the convergence of the Preconditioned Conjugate Gradient (PCG) method. We will also review some techniques to efficiently approximate the linearly independent vectors which constitute the low-rank corrections and whose choice is crucial for the effectiveness of the approach. Numerical results on real-life applications show that the performance of a given iterative solver can be substantially enhanced by the use of low-rank updates.
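The effect of such a low-rank spectral update can be reproduced in a few lines of SciPy. The sketch below is illustrative, not taken from the survey: starting from no initial preconditioner, the k smallest eigenpairs of an SPD model matrix build M = I + V(Λ⁻¹ − I)Vᵀ, which maps each deflated eigenvalue of MA to 1 and leaves the rest untouched, so PCG converges faster. The matrix, k = 8, and the identity initial preconditioner are assumptions chosen for the demonstration.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator, cg, eigsh

n = 200
# SPD model problem: 1D Laplacian stencil
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# k smallest eigenpairs of A (shift-invert about 0 for fast convergence)
k = 8
lam, V = eigsh(A, k=k, sigma=0)

def apply_update(r):
    # M = I + V (Lambda^{-1} - I) V^T: the k smallest eigenvalues of M A
    # become 1, the rest are unchanged, so the spectrum clusters around 1.
    return r + V @ ((1.0 / lam - 1.0) * (V.T @ r))

M = LinearOperator((n, n), matvec=apply_update)

iters = {"plain": 0, "deflated": 0}

def count(key):
    def cb(xk):
        iters[key] += 1
    return cb

x_plain, _ = cg(A, b, callback=count("plain"))
x_defl, _ = cg(A, b, M=M, callback=count("deflated"))
print(iters)
```

Because the deflated operator's effective condition number drops from λ_max/λ_1 to λ_max/λ_{k+1}, the second run needs markedly fewer iterations on this example.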


2012 ◽  
Vol 2012 ◽  
pp. 1-9 ◽  
Author(s):  
Shi-Liang Wu ◽  
Cui-Xia Li

The finite difference discretization of Helmholtz equations usually leads to large sparse linear systems. Since the coefficient matrix is frequently indefinite, these systems are difficult to solve iteratively. In this paper, a modified symmetric successive overrelaxation (MSSOR) preconditioning strategy is constructed based on the coefficient matrix and employed to speed up the convergence rate of iterative methods. The idea is to increase the values of the diagonal elements of the coefficient matrix to obtain better preconditioners for the original linear systems. Compared with the SSOR preconditioner, the MSSOR preconditioner incurs no additional computational cost to improve the convergence rate of iterative methods. Numerical results demonstrate that this method can reduce both the number of iterations and the computational time significantly, with low cost for the construction and implementation of the preconditioners.
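The SSOR-with-lifted-diagonal construction can be sketched generically with SciPy. The matrix below is a small shifted-Laplacian stand-in for a discretized Helmholtz operator, and the values of ω and α (the amount added to the preconditioner's diagonal) are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy.sparse import diags, identity, tril, triu
from scipy.sparse.linalg import LinearOperator, gmres, spsolve_triangular

n = 100
lap = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (lap - 0.5 * identity(n)).tocsr()  # mildly indefinite Helmholtz-type stand-in
b = np.ones(n)

omega, alpha = 1.0, 0.5           # alpha > 0 lifts the diagonal (the MSSOR idea)
Dhat = A.diagonal() + alpha       # modified diagonal
L = tril(A, k=-1)
U = triu(A, k=1)
lower = (diags(Dhat) + omega * L).tocsr()
upper = (diags(Dhat) + omega * U).tocsr()

def mssor_solve(r):
    # z = M^{-1} r, M = 1/(omega(2-omega)) (Dhat + omega L) Dhat^{-1} (Dhat + omega U)
    y = spsolve_triangular(lower, r, lower=True)
    z = spsolve_triangular(upper, Dhat * y, lower=False)
    return omega * (2.0 - omega) * z

M = LinearOperator((n, n), matvec=mssor_solve)
x, info = gmres(A, b, M=M, restart=n)  # full GMRES, preconditioned
```

The preconditioner application costs only two triangular solves and a diagonal scaling per iteration, the same as plain SSOR, which is the "no additional computational cost" point made in the abstract.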


SPE Journal ◽  
2018 ◽  
Vol 23 (02) ◽  
pp. 589-597 ◽  
Author(s):  
Sebastian Gries

Summary: System-algebraic multigrid (System-AMG) provides a flexible framework for linear systems in simulation applications that involve various types of physical unknowns. Reservoir-simulation applications, with their driving elliptic pressure unknown, are principally well-suited to exploit System-AMG as a robust and efficient solver method. However, the coarse grid correction must be physically meaningful to speed up the overall convergence. It has been common practice in constrained-pressure-residual (CPR)-type applications to use an approximate pressure/saturation decoupling to fulfill this requirement. Unfortunately, this can have significant effects on the AMG applicability and, thus, is not performed by the dynamic row-sum (DRS) method. This work shows that the pressure/saturation decoupling is not necessary for ensuring an efficient interplay between the coarse grid correction process and the fine-level problem, demonstrating that a comparable influence of the pressure on the different involved partial differential equations (PDEs) is much more crucial. As an extreme case with respect to the outlined requirement, linear systems from compositional simulations under the volume-balance formulation will be discussed. In these systems, the pressure typically is associated with a volume balance rather than a diffusion process. It will be shown how System-AMG can still be used in such cases.
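The CPR-type two-stage structure referred to above can be sketched generically: correct the elliptic pressure subsystem first, then apply a cheap global smoother to the full-system residual. In the sketch below, the block sizes, couplings, the direct pressure solve (a stand-in for one AMG cycle), and the Jacobi smoother are all illustrative assumptions; this is not the paper's DRS method.

```python
import numpy as np
from scipy.sparse import bmat, diags, identity
from scipy.sparse.linalg import LinearOperator, gmres, splu

n = 50  # cells; unknowns ordered [pressure, saturation]
lap = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
App = (lap + 0.1 * identity(n)).tocsc()   # elliptic pressure block
Ass = (1.0 * identity(n)).tocsc()         # stand-in saturation block
C = (0.05 * identity(n)).tocsc()          # weak inter-field coupling
A = bmat([[App, C], [C, Ass]], format="csr")
b = np.ones(2 * n)

pressure_solve = splu(App).solve  # stand-in for an AMG cycle on the pressure block
Dinv = 1.0 / A.diagonal()

def cpr(r):
    # Stage 1: correct the driving elliptic pressure subsystem.
    x = np.zeros_like(r)
    x[:n] = pressure_solve(r[:n])
    # Stage 2: one global Jacobi sweep on the updated residual.
    return x + Dinv * (r - A @ x)

M = LinearOperator((2 * n, 2 * n), matvec=cpr)
x, info = gmres(A, b, M=M, restart=2 * n)
```

The design point is that stage 1 only ever sees the pressure block, which is why the quality of the pressure subsystem (and its physical meaning) dominates the effectiveness of the whole preconditioner.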


2018 ◽  
Vol 18 (3) ◽  
pp. 449
Author(s):  
Thiago Nascimento Rodrigues ◽  
Maria Claudia Silva Boeres ◽  
Lucia Catabriga

The Reverse Cuthill-McKee (RCM) algorithm is a well-known heuristic for reordering sparse matrices. It is typically used to speed up the solution of sparse linear systems of equations. This paper describes two parallel approaches to the RCM algorithm, as well as an optimized version of each based on proposed enhancements. The first exploits a strategy for reducing lazy threads, while the second uses a static bucket array as its main data structure and suppresses some steps performed by the original algorithm. These changes led to outstanding reordering times and significant bandwidth reductions. The performance of the two algorithms is compared with the corresponding implementation provided by the Boost library. The OpenMP framework is used to support the parallelism, and both versions of the algorithm are tested on large sparse, structurally symmetric matrices.
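For reference, a sequential RCM reordering is available in SciPy. The sketch below, with an arbitrary grid size and seed, scrambles a 2D five-point Laplacian with a random permutation and lets RCM recover a small bandwidth — the same effect the parallel implementations above aim to compute faster.

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.csgraph import reverse_cuthill_mckee

m = 20
T = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
# 2D 5-point Laplacian on an m-by-m grid: bandwidth m under natural ordering
A = (kron(identity(m), T) + kron(T, identity(m))).tocsr()

rng = np.random.default_rng(1)
p = rng.permutation(m * m)
S = A[p, :][:, p]  # random relabeling destroys the banded structure

perm = reverse_cuthill_mckee(S, symmetric_mode=True)
R = S[perm, :][:, perm]  # apply the symmetric RCM permutation

def bandwidth(M):
    coo = M.tocoo()
    return int(np.max(np.abs(coo.row - coo.col)))

print(bandwidth(S), bandwidth(R))  # RCM shrinks the bandwidth substantially
```

A smaller bandwidth confines fill-in during factorization and improves cache locality in sparse triangular solves, which is why RCM is used as a preprocessing step for linear solvers.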


10.29007/98fh ◽  
2018 ◽  
Author(s):  
Severin Neumann

In applications of symbolic computation, the computation of Gröbner bases is an often required but expensive procedure, so it is natural to develop parallel algorithms for it. There are parallel flavours of the F4 algorithm that use the special structure of the occurring matrices to speed up the reduction. In this paper we start from this work and present modifications allowing efficient computation of Gröbner bases on parallel architectures using shared as well as distributed memory. To achieve this, we concentrate on one objective: reducing the memory consumption and avoiding communication overhead. We remove unrequired steps of the reduction, split the columns of the matrix into blocks for distribution, and review the effectiveness of the SIMPLIFY function. Finally, we provide benchmarks with up to 256 distributed threads of an implementation which will be available at https://github.com/svrnm/parallelGBC.
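The object being computed can be illustrated with SymPy's sequential implementation (not the parallel F4 code above); the polynomial system is a standard small example chosen only for illustration. A lex-order Gröbner basis triangularizes the system, much as Gaussian elimination triangularizes a linear one.

```python
from sympy import groebner, symbols

x, y, z = symbols("x y z")
# A classical small system of three quadrics
polys = [x**2 + y + z - 1, x + y**2 + z - 1, x + y + z**2 - 1]

# Lexicographic Groebner basis: successive elements eliminate variables,
# so the last basis element involves z alone.
G = groebner(polys, x, y, z, order="lex")
for g in G.exprs:
    print(g)
```

Every input polynomial reduces to zero modulo the basis, which is the defining property the F4-style matrix reductions in the paper compute at scale.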

