Parallel algorithm for the matrix chain product and the optimal triangulation problems (extended abstract)

Author(s):  
Artur Czumaj
1995 ◽  
Vol 05 (02) ◽  
pp. 263-274 ◽  
Author(s):  
MARK A. STALZER

Presented is a parallel algorithm based on the fast multipole method (FMM) for the Helmholtz equation. This variant of the FMM is useful for computing radar cross sections and antenna radiation patterns. The FMM decomposes the impedance matrix into sparse components, reducing the operation count of the matrix-vector multiplication in iterative solvers to O(N^{3/2}) (where N is the number of unknowns). The parallel algorithm divides the problem into groups and assigns the computation involved with each group to a processor node. Careful consideration is given to the communication costs. A time complexity analysis of the algorithm is presented and compared with empirical results from a Paragon XP/S running the lightweight Sandia/University of New Mexico operating system (SUNMOS). For a 90,000 unknown problem running on 60 nodes, the sparse representation fits in memory and the algorithm computes the matrix-vector product in 1.26 seconds. It sustains an aggregate rate of 1.4 Gflop/s. The corresponding dense matrix would occupy over 100 Gbytes and, assuming that I/O is free, would require on the order of 50 seconds to form the matrix-vector product.
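The dense-matrix figures quoted above can be checked with a short back-of-envelope calculation. The sketch below is only an illustration: the 16 bytes per matrix entry (double-precision complex) and the 8 real flops per complex multiply-add are assumptions, not values stated in the abstract.

```python
# Back-of-envelope check of the figures quoted in the abstract.
# ASSUMPTIONS (not from the abstract): double-precision complex entries
# (16 bytes each) and 8 real flops per complex multiply-add.
N = 90_000        # number of unknowns (from the abstract)
rate = 1.4e9      # sustained aggregate rate in flop/s (from the abstract)
t_fmm = 1.26      # seconds per FMM matrix-vector product (from the abstract)

dense_bytes = N * N * 16      # storage for the dense impedance matrix
dense_flops = N * N * 8       # work for one dense matrix-vector product
t_dense = dense_flops / rate  # time at the same sustained rate
fmm_flops = rate * t_fmm      # work implied by the measured FMM product

print(f"dense storage: {dense_bytes / 1e9:.1f} GB")
print(f"dense matvec : {t_dense:.1f} s at the same rate")
print(f"FMM matvec   : {fmm_flops / 1e9:.2f} Gflop in {t_fmm} s")
print(f"N^(3/2) vs N^2: {N ** 1.5:.2e} vs {N ** 2:.2e}")
```

Under these assumptions the dense matrix needs roughly 130 GB and about 46 seconds per product, which is consistent with the abstract's "over 100 Gbytes" and "on the order of 50 seconds".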


1990 ◽  
Vol 02 (02) ◽  
pp. 181-191 ◽  
Author(s):  
PRADEEP PANDEY ◽  
CHARLES KENNEY ◽  
ALAN J. LAUB

1995 ◽  
Vol 05 (03) ◽  
pp. 257-271 ◽  
Author(s):  
MIKHAIL J. ATALLAH ◽  
DANNY Z. CHEN

Many problems on sequences and on special kinds of graphs involve the computation of longest chains passing through points in the plane. Given a set S of n points in the plane, we consider the problem of computing the matrix of longest chain lengths between all pairs of points in S, and the matrix of "parent" pointers that describes the n longest chain trees. We present a simple sequential algorithm for computing these matrices. Our algorithm runs in O(n^2) time, and hence is optimal. We also present a rather involved parallel algorithm that computes these matrices in O((log n)^2) time using O(n^2/log n) processors in the CREW PRAM model. These matrices enable us to report, in O(1) time, the length of a longest chain between any two points in S by using one processor, and the actual chain by using k processors, where k is the number of points of S on that chain. The space complexity of the algorithms is O(n^2).
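To make the two matrices concrete, here is a naive baseline that fills them by dynamic programming. It is not the O(n^2) algorithm of the abstract: this sketch takes O(n^3) time. It also assumes that a chain is a sequence of points strictly increasing in both coordinates and that its length is the number of points on it. The function names are hypothetical.

```python
def longest_chains(points):
    """Naive O(n^3) all-pairs longest-chain tables (illustrative baseline).

    length[s][t] = number of points on a longest chain from s to t
                   (0 if t cannot follow s); parent[s][t] = predecessor
                   of t on one such chain.
    """
    n = len(points)
    order = sorted(range(n), key=lambda i: points[i])  # lexicographic (x, y)

    def below(a, b):
        # ASSUMED chain relation: strictly increasing in x and in y.
        return points[a][0] < points[b][0] and points[a][1] < points[b][1]

    length = [[0] * n for _ in range(n)]
    parent = [[None] * n for _ in range(n)]
    for s in range(n):
        length[s][s] = 1
        for t in order:                 # predecessors of t are processed first
            if not below(s, t):
                continue
            for k in order:
                if length[s][k] and below(k, t) and length[s][k] + 1 > length[s][t]:
                    length[s][t] = length[s][k] + 1
                    parent[s][t] = k
    return length, parent


def chain(length, parent, s, t):
    """Recover one longest chain from s to t by following parent pointers."""
    if not length[s][t]:
        return []
    path = [t]
    while path[-1] != s:
        path.append(parent[s][path[-1]])
    return path[::-1]


pts = [(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)]
length, parent = longest_chains(pts)
print(length[0][3], chain(length, parent, 0, 3))
```

Once the tables exist, a length query is a single lookup, and the chain itself is read off the parent pointers, which mirrors the query behaviour described in the abstract.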


Author(s):  
А.К. Новиков ◽  
C.П. Копысов ◽  
Н.С. Недожогин

The acceleration of preconditioned bi-conjugate gradient stabilized (BiCGStab) methods is studied for a preconditioner based on approximating the matrix inverse with the Sherman-Morrison formula. A new form of the parallel algorithm is considered that uses matrix-vector products to form the preconditioner matrices. The efficiency of parallelizing the most resource-intensive operations of this preconditioner on multi-core central and graphics processing units (CPUs and GPUs) is shown.
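For reference, the Sherman-Morrison formula states that (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u) whenever the denominator is nonzero. The sketch below only illustrates this rank-one update on a tiny dense matrix. It is not the preconditioner construction of the paper, and the function names are hypothetical.

```python
def matvec(M, x):
    """Dense matrix-vector product for lists of lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def sherman_morrison(A_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, via one rank-one update."""
    n = len(u)
    Au = matvec(A_inv, u)                                   # A^{-1} u
    vA = [sum(v[i] * A_inv[i][j] for i in range(n))         # v^T A^{-1}
          for j in range(n)]
    denom = 1.0 + sum(vi * a for vi, a in zip(v, Au))       # 1 + v^T A^{-1} u
    if abs(denom) < 1e-14:
        raise ZeroDivisionError("A + u v^T is singular or nearly singular")
    return [[A_inv[i][j] - Au[i] * vA[j] / denom for j in range(n)]
            for i in range(n)]


# Example: A = diag(2, 4), u = (1, 1), v = (1, 0), so A + u v^T = [[3, 0], [1, 4]].
A_inv = [[0.5, 0.0], [0.0, 0.25]]
B_inv = sherman_morrison(A_inv, [1.0, 1.0], [1.0, 0.0])
print(B_inv)
```

Note that the update needs only matrix-vector products with A^{-1}, which is the kind of operation the abstract identifies as the target for parallelization.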


Author(s):  
С.А. Харченко

A parallel algorithm is considered for computing the sparse $QR$ decomposition of a specially ordered rectangular matrix, based on sparse block Householder transformations. The required ordering can be obtained from a nested-dissection-type column ordering built from the structure of the matrix $A^{T}A$, where $A$ is the original rectangular matrix. For mesh-based problems, the ordering can be constructed from a known volume partitioning of the computational mesh. The basic building block for the parallel computations is the $QR$ decomposition of sets of matrix rows augmented with an initial zero block.
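As background for the Householder transformations mentioned above, the following is a minimal dense, unblocked, sequential Householder QR in plain Python. It is a textbook sketch only: the sparse, block, and parallel aspects that are the subject of the paper are not represented, and the function name is hypothetical.

```python
import math


def householder_qr(A):
    """Dense Householder QR of an m x n matrix with m >= n (lists of lists).

    Returns (Q, R) with Q an m x m orthogonal matrix and R an m x n
    upper-triangular matrix such that Q R = A up to rounding.
    """
    m, n = len(A), len(A[0])
    R = [[float(t) for t in row] for row in A]
    Q = [[float(i == j) for j in range(m)] for i in range(m)]
    for k in range(n):
        x = [R[i][k] for i in range(k, m)]
        norm = math.sqrt(sum(t * t for t in x))
        if norm == 0.0:
            continue  # column is already zero at and below the diagonal
        # Reflector v = x + sign(x0) * ||x|| * e1 (sign chosen to avoid cancellation)
        v = x[:]
        v[0] += math.copysign(norm, x[0])
        vv = sum(t * t for t in v)
        # R <- H R with H = I - 2 v v^T / (v^T v), acting on rows k..m-1
        for j in range(n):
            s = 2.0 * sum(v[i] * R[k + i][j] for i in range(m - k)) / vv
            for i in range(m - k):
                R[k + i][j] -= s * v[i]
        # Q <- Q H, acting on columns k..m-1
        for r in range(m):
            s = 2.0 * sum(Q[r][k + i] * v[i] for i in range(m - k)) / vv
            for i in range(m - k):
                Q[r][k + i] -= s * v[i]
    return Q, R


Q, R = householder_qr([[1, 2], [3, 4], [5, 6]])
print(R)
```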


Algorithms ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 309
Author(s):  
Aleksandr Cariow ◽  
Janusz P. Paplinski

The article presents a parallel hardware-oriented algorithm designed to speed up the division of two octonions. The advantage of the proposed algorithm is that the number of real multiplications is halved as compared to the naive method for implementing this operation. In the synthesis of the discussed algorithm, the matrix representation of this operation was used, which allows us to present the division of octonions by means of a vector–matrix product. Taking into account a specific structure of the matrix multiplicand allows for reducing the number of real multiplications necessary for the execution of the octonion division procedure.
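For comparison, a naive octonion division can be sketched as x / y = x · conj(y) / |y|^2, with the product built by the Cayley-Dickson construction. This is the baseline kind of method, not the reduced-multiplication algorithm of the article. The component ordering and the particular Cayley-Dickson sign convention used here are assumptions, and the function names are hypothetical.

```python
def conj(x):
    """Cayley-Dickson conjugate of a list of length 1, 2, 4 or 8."""
    if len(x) == 1:
        return x[:]
    h = len(x) // 2
    return conj(x[:h]) + [-t for t in x[h:]]


def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - conj(d) b, d a + b conj(c))."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    left = [s - t for s, t in zip(mul(a, c), mul(conj(d), b))]
    right = [s + t for s, t in zip(mul(d, a), mul(b, conj(c)))]
    return left + right


def div(x, y):
    """Naive right division x / y = x * conj(y) / |y|^2."""
    norm2 = sum(t * t for t in y)
    if norm2 == 0:
        raise ZeroDivisionError("division by the zero octonion")
    return [t / norm2 for t in mul(x, conj(y))]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0, -1.0, 0.0, 3.0, 1.0, -2.0, 4.0, 1.0]
q = div(x, y)
print(mul(q, y))  # should reproduce x up to rounding
```

In this sketch the recursive product of two octonions performs 64 real multiplications, before counting the norm and the final scaling. That is the kind of cost a structured vector-matrix formulation aims to reduce.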


Author(s):  
Odell T. Minick ◽  
Hidejiro Yokoo

Mitochondrial alterations were studied in 25 liver biopsies from patients with alcoholic liver disease. Of special interest was the morphologic resemblance between certain fine structural variations in mitochondria and crystalloid inclusions. Four types of alterations within mitochondria were found that seemed to relate to cytoplasmic crystalloids. Type 1 alteration consisted of localized groups of cristae, usually oriented in the long direction of the organelle (Fig. 1A). In this plane they appeared serrated at the periphery with blind endings in the matrix. Other sections revealed a system of equally-spaced diagonal lines lengthwise in the mitochondrion with cristae protruding from both ends (Fig. 1B). Profiles of this inclusion were not unlike tangential cuts of a crystalloid structure frequently seen in enlarged mitochondria described below.


Author(s):  
R. A. Ricks ◽  
Angus J. Porter

During a recent investigation concerning the growth of γ' precipitates in nickel-base superalloys, it was observed that the sign of the lattice mismatch between the coherent particles and the matrix (γ) was important in determining the ease with which matrix dislocations could be incorporated into the interface to relieve coherency strains. Thus alloys with a negative misfit (i.e., the γ' lattice parameter was smaller than that of the matrix) could lose coherency easily, and γ/γ' interfaces would exhibit regularly spaced networks of dislocations, as shown in figure 1 for the case of Nimonic 115 (misfit = -0.15%). In contrast, γ' particles in alloys with a positive misfit could grow to a large size and not show any such dislocation arrangements in the interface, thus indicating that coherency had not been lost. Figure 2 depicts a large γ' precipitate in Nimonic 80A (misfit = +0.32%) showing few interfacial dislocations.


Author(s):  
S. Mahajan ◽  
M. R. Pinnel ◽  
J. E. Bennett

The microstructural changes in an Fe-Co-V alloy (composition by wt.%: 2.97 V, 48.70 Co, 47.34 Fe and balance impurities, such as C, P and Ni) resulting from different heat treatments have been evaluated by optical metallography and transmission electron microscopy. Results indicate that, on air cooling or quenching into iced-brine from the high temperature single phase γ (fcc) field, vanadium can be retained in a supersaturated solid solution (α2) which has bcc structure. For the range of cooling rates employed, a portion of the material appears to undergo the γ-α2 transformation massively and the remainder martensitically. Figure 1 shows dislocation topology in a region that may have transformed martensitically. Dislocations are homogeneously distributed throughout the matrix, and there is no evidence for cell formation. The majority of the dislocations project along the projections of <111> vectors onto the (111) plane, implying that they are predominantly of screw character.

