Interactively Exploring the Connection between Nested Dissection Orderings for Parallel Cholesky Factorization and Vertex Separators

Author(s):  
H. Martin Bucker ◽  
M. Ali Rostami
2021 ◽  
Author(s):  
Kai Hormann ◽  
Craig Gotsman

We describe a simple and practical algorithm for compact routing on graphs which admit compact and balanced vertex separators. Using a recursive nested dissection of the n-vertex graph based on these separators, we construct routing tables with as few as O(log n) entries per vertex in a preprocessing step. They support handshaking-based routing on the graph with moderate stretch, where the handshaking can be implemented similarly to a DNS lookup. We describe a basic version of the algorithm that requires modifiable headers and a more advanced version which eliminates this need and provides better stretch. A number of algorithmic parameters control a graceful tradeoff between the size of the routing tables and the stretch. Our routing algorithm is most effective on planar graphs and unit disk graphs of moderate edge/vertex density.


2013 ◽  
Vol 706-708 ◽  
pp. 1890-1893
Author(s):  
Lu Yao ◽  
Yi Yang ◽  
Zheng Hua Wang ◽  
Wei Cao

Matrix ordering is a key technique when applying the Cholesky factorization method to solve a sparse symmetric positive definite system Ax = b. Much effort has been devoted to the development of powerful heuristic ordering algorithms. This paper implements a sparse matrix ordering scheme based on hypergraph partitioning. The novel nested dissection ordering scheme obtains the vertex separator by hypergraph partitioning. Experimental results show that the novel scheme produces results that are substantially better than those of METIS.
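The general idea of a nested dissection ordering can be illustrated with a minimal sketch. It does not reproduce the hypergraph-partitioning scheme of the paper: it assumes a regular grid graph (5-point stencil) and uses the middle row or column as a geometric vertex separator. The function name `nested_dissection` and the base-case block size are choices made for this illustration only.

```python
def nested_dissection(rows, cols):
    """Return an elimination order for the vertices (r, c) of a rows x cols grid graph.

    Assumption: vertices are connected to their horizontal and vertical
    neighbours only, so a full row or column is a valid vertex separator.
    """
    order = []

    def dissect(r0, r1, c0, c1):
        # Work on the half-open block [r0, r1) x [c0, c1).
        height, width = r1 - r0, c1 - c0
        if height <= 0 or width <= 0:
            return
        if height <= 2 and width <= 2:
            # Small block: number its vertices directly.
            order.extend((r, c) for r in range(r0, r1) for c in range(c0, c1))
            return
        if width >= height:
            # Split along the longer side with a separator column.
            mid = c0 + width // 2
            dissect(r0, r1, c0, mid)
            dissect(r0, r1, mid + 1, c1)
            # Separator vertices are numbered after both halves.
            order.extend((r, mid) for r in range(r0, r1))
        else:
            # Same idea with a separator row.
            mid = r0 + height // 2
            dissect(r0, mid, c0, c1)
            dissect(mid + 1, r1, c0, c1)
            order.extend((mid, c) for c in range(c0, c1))

    dissect(0, rows, 0, cols)
    return order
```

Numbering the separator last means that eliminating the two halves creates no fill between them, which is also what makes the halves independent tasks in a parallel factorization.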


2021 ◽  
Vol 402 ◽  
pp. 126037
Author(s):  
Li Chen ◽  
Shuisheng Zhou ◽  
Jiajun Ma ◽  
Mingliang Xu

Author(s):  
Jack Poulson

Determinantal point processes (DPPs) were introduced by Macchi (Macchi 1975 Adv. Appl. Probab. 7, 83–122) as a model for repulsive (fermionic) particle distributions. But their recent popularization is largely due to their usefulness for encouraging diversity in the final stage of a recommender system (Kulesza & Taskar 2012 Found. Trends Mach. Learn. 5, 123–286). The standard sampling scheme for finite DPPs is a spectral decomposition followed by an equivalent of a randomly diagonally pivoted Cholesky factorization of an orthogonal projection, which is only applicable to Hermitian kernels and has an expensive set-up cost. Researchers (Launay et al. 2018, http://arxiv.org/abs/1802.08429; Chen & Zhang 2018 NeurIPS, https://papers.nips.cc/paper/7805-fast-greedy-map-inference-for-determinantal-point-process-to-improve-recommendation-diversity.pdf) have begun to connect DPP sampling to LDL^H factorizations as a means of avoiding the initial spectral decomposition, but existing approaches have only outperformed the spectral decomposition approach in special circumstances, where the number of kept modes is a small percentage of the ground set size. This article proves that trivial modifications of LU and LDL^H factorizations yield efficient direct sampling schemes for non-Hermitian and Hermitian DPP kernels, respectively. Furthermore, it is experimentally shown that even dynamically scheduled, shared-memory parallelizations of high-performance dense and sparse-direct factorizations can be trivially modified to yield DPP sampling schemes with essentially identical performance. The software developed as part of this research, Catamari (hodgestar.com/catamari), is released under the Mozilla Public License v.2.0. It contains header-only, C++14 plus OpenMP 4.0 implementations of dense and sparse-direct, Hermitian and non-Hermitian DPP samplers. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
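A factorization-based sampler of this kind can be sketched in a few lines. The following is a dense, unblocked illustration in plain Python and not the Catamari implementation: it assumes the input is a marginal kernel given as a list of lists, and the name `sample_dpp` and the `rng` parameter are hypothetical. The only change relative to an unpivoted right-looking LU factorization is the coin flip on each pivot.

```python
import random


def sample_dpp(kernel, rng=random):
    """Draw one sample from the DPP whose marginal kernel is `kernel`.

    Assumption: `kernel` is a square list of lists whose entries define a
    valid marginal kernel, so each pivot encountered lies in [0, 1].
    """
    a = [row[:] for row in kernel]  # factor a copy, leave the input intact
    n = len(a)
    sample = []
    for j in range(n):
        # The current pivot is the probability of keeping index j,
        # conditioned on the decisions already made for indices < j.
        if rng.random() < a[j][j]:
            sample.append(j)
        else:
            a[j][j] -= 1.0  # condition on j being excluded
        # Ordinary LU elimination step on the trailing submatrix.
        for i in range(j + 1, n):
            a[i][j] /= a[j][j]
            for k in range(j + 1, n):
                a[i][k] -= a[i][j] * a[j][k]
    return sample
```

For example, with the rank-one projection kernel `[[0.5, 0.5], [0.5, 0.5]]` every draw contains exactly one of the two indices, as expected for a projection DPP of rank one.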


1994 ◽  
Vol 02 (04) ◽  
pp. 371-422 ◽  
Author(s):  
E. PADOVANI ◽  
E. PRIOLO ◽  
G. SERIANI

The finite element method (FEM) is a numerical technique well suited to solving problems of elastic wave propagation in complex geometries and heterogeneous media. The main advantages are that very irregular grids can be used, free surface boundary conditions can be easily taken into account, a good reconstruction is possible of irregular surface topography, and complex geometries, such as curved, dipping and rough interfaces, intrusions, cusps, and holes can be defined. The main drawbacks of the classical approach are the need for a large amount of memory, low computational efficiency, and the possible appearance of spurious effects. In this paper we describe some experience in improving the computational efficiency of a finite element code based on a global approach, and used for seismic modeling in geophysical oil exploration. Results from the use of different methods and models run on a mini-superworkstation APOLLO DN10000 are reported and compared. With Chebyshev spectral elements, great accuracy can be reached with almost no numerical artifacts. Static condensation of the spectral element's internal nodes dramatically reduces memory requirements and CPU time. Time integration performed with the classical implicit Newmark scheme is very accurate but not very efficient. Due to the high sparsity of the matrices, the use of compressed storage is shown to greatly reduce not only memory requirements but also computing time. The operation which most affects the performance is the matrix-by-vector product; an effective programming of this subroutine for the storage technique used is decisive. The conjugate gradient method preconditioned by incomplete Cholesky factorization provides, in general, a good compromise between efficiency and memory requirements. Spectral elements greatly increase its efficiency, since the number of iterations is reduced. 
The most efficient and accurate method is a hybrid iterative-direct solution of the linear system arising from the static condensation of high order elements. The size of 2D models that can be handled in a reasonable time on this kind of computer is nowadays hardly sufficient, and significant 3D modeling is completely unfeasible. However the introduction of new FEM algorithms coupled with the use of new computer architectures is encouraging for the future.
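The role of the compressed-storage matrix-by-vector product inside the conjugate gradient iteration can be shown with a small sketch. It assumes compressed sparse row (CSR) storage, which the abstract does not specify, and it omits the incomplete Cholesky preconditioner for brevity. The function names `csr_matvec` and `conjugate_gradient` are illustrative.

```python
def csr_matvec(data, indices, indptr, x):
    """Compute y = A x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        total = 0.0
        # Only the stored nonzeros of row i are visited.
        for p in range(indptr[i], indptr[i + 1]):
            total += data[p] * x[indices[p]]
        y[i] = total
    return y


def conjugate_gradient(data, indices, indptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (unpreconditioned CG)."""
    x = [0.0] * len(b)
    r = list(b)  # residual for the zero initial guess
    p = list(r)  # first search direction
    rr = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rr ** 0.5 <= tol:
            break
        ap = csr_matvec(data, indices, indptr, p)  # the dominant cost per iteration
        alpha = rr / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

Each iteration performs exactly one matrix-by-vector product, which is why the abstract singles out that routine as decisive for performance.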


1978 ◽  
Vol 15 (4) ◽  
pp. 662-673 ◽  
Author(s):  
Alan George ◽  
William G. Poole, Jr. ◽  
Robert G. Voigt

Author(s):  
Erfan Bank Tavakoli ◽  
Michael Riera ◽  
Masudul Hassan Quraishi ◽  
Fengbo Ren
