Algorithm 1019: A Task-based Multi-shift QR/QZ Algorithm with Aggressive Early Deflation

2022 ◽  
Vol 48 (1) ◽  
pp. 1-36
Author(s):  
Mirko Myllykoski

The QR algorithm is one of the three phases in the process of computing the eigenvalues and eigenvectors of a dense nonsymmetric matrix. This paper describes a task-based QR algorithm for reducing an upper Hessenberg matrix to real Schur form. The task-based algorithm also supports generalized eigenvalue problems (the QZ algorithm), but this paper concentrates on the standard case. The algorithm adopts previous algorithmic improvements, such as tightly coupled multi-shifts and Aggressive Early Deflation (AED), and also incorporates several new ideas that significantly improve performance. These include, among others, the elimination of several synchronization points, the dynamic merging of previously separate computational steps, the shortening and prioritization of the critical path, and experimental GPU support. The task-based implementation is demonstrated to be multiple times faster than multi-threaded LAPACK and ScaLAPACK in both single-node and multi-node configurations on two different machines based on Intel and AMD CPUs. The implementation is built on top of the StarPU runtime system and is part of the open-source StarNEig library.
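To make the role of the QR phase concrete, the following minimal sketch reduces a small random matrix to upper Hessenberg form and then to real Schur form. It uses SciPy's LAPACK-backed `hessenberg` and `schur` routines as stand-ins; it is not the task-based StarNEig implementation described in the abstract, and the matrix size and random seed are arbitrary assumptions. Note that `schur` performs its own Hessenberg reduction internally, so calling it on a matrix that is already Hessenberg is done here only to keep the two phases visible.

```python
import numpy as np
from scipy.linalg import hessenberg, schur

# Assumed test problem: a small dense nonsymmetric matrix.
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))

# Phase 1: orthogonal reduction to upper Hessenberg form, A = Q H Q^T.
H, Q = hessenberg(A, calc_q=True)

# Phase 2 (the QR algorithm phase): reduction of H to real Schur form,
# H = Z T Z^T, where T is quasi upper triangular with 1x1 and 2x2
# diagonal blocks (2x2 blocks hold complex conjugate eigenvalue pairs).
T, Z = schur(H, output="real")

# Accumulated orthogonal transformation, so that A = U T U^T.
U = Q @ Z
residual = np.linalg.norm(A - U @ T @ U.T) / np.linalg.norm(A)

# The eigenvalues of A can be read off the diagonal blocks of T.
eigenvalues = np.linalg.eigvals(T)

print("relative residual:", residual)
print("eigenvalues:", eigenvalues)
```

The third phase, computing eigenvectors from the Schur form, is omitted here.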

2009 ◽  
Vol 16 (1) ◽  
pp. 63-86 ◽  
Author(s):  
D. Steven Mackey ◽  
Niloufer Mackey ◽  
Christian Mehl ◽  
Volker Mehrmann

1968 ◽  
Vol 64 (1) ◽  
pp. 193-202
Author(s):  
Nurettin Y. Ölçer

Recently, through a repeated application of one-dimensional finite integral transforms, Cinelli (1) gave a solution for the temperature distribution in a hollow circular cylinder of finite length. Since no new ideas or techniques are introduced, the extension claimed in (1) with regard to the finite Hankel transform technique employed in the transformation of the radial space variable in the hollow cylinder problem is trivial, in view of well-known works by Sneddon (2) and Tranter (3), to mention a few. The list of the finite Hankel transforms given in (1) for a variety of boundary conditions at r = a and r = b is the result of routine algebraic manipulations well known from the general theory of eigenvalue problems specialized for the hollow cylinder. In this list, a set of seemingly different series expansions is given for the inverse Hankel transform for each combination of boundary conditions at the two radial surfaces. In each case, the two expressions for inversion can readily be shown to be identical to each other when use is made of the frequency equation. One of the inversion forms is therefore unnecessary once the other is given. Furthermore, the general solution as given by equation (54) of Cinelli (1) does not satisfy his boundary conditions (27), (28), (29) and (30), unless these latter are homogeneous.


Author(s):  
P. A. Binding ◽  
H. Volkmer

An eigenvalue problem for k Sturm–Liouville equations coupled by k parameters λ_1, …, λ_k is considered. In contrast to the standard case, for each r, the second-order derivative in the r-th equation is multiplied by λ_r. This problem presents various interesting features. For example, the existence of eigenvalues with oscillation counts beyond a certain (computable) value is obtained without any of the restrictive definiteness conditions known from the standard case. Uniqueness is also analysed, and the results are given greater precision via eigencurve methods in the case of two equations coupled by two parameters.


2016 ◽  
Vol 43 (1) ◽  
pp. 1-19 ◽  
Author(s):  
C. Kristopher Garrett ◽  
Zhaojun Bai ◽  
Ren-Cang Li

2011 ◽  
Vol 1 (2) ◽  
pp. 187-196
Author(s):  
Takafumi Miyata ◽  
Yusaku Yamamoto ◽  
Takashi Uneyama ◽  
Yoshimasa Nakamura ◽  
Shao-Liang Zhang

The multishift QR algorithm is efficient for computing all the eigenvalues of a dense, large-scale, non-Hermitian matrix. The major part of this algorithm can be performed by matrix-matrix multiplications and is therefore suitable for modern processors with hierarchical memory. A variant of this algorithm was recently proposed that can execute more of its computational parts by matrix-matrix multiplications. The variant is especially appropriate for recent coprocessors that contain many processor elements, such as the CSX600. However, the performance of the algorithm depends strongly on the setting of parameters such as the number of shifts and the number of divisions in the algorithm. The optimal settings differ depending on the matrix size and the computational environment. In this paper, we construct a performance model to predict the parameter setting that minimizes the execution time of the algorithm. Experimental results with the CSX600 coprocessor show that our model can be used to find the optimal setting.


1958 ◽  
Vol 3 (12) ◽  
pp. 364-365
Author(s):  
MARTIN T. ORNE