High-performance parallel implementations of flow accumulation algorithms for multicore architectures

2021 ◽  
Vol 151 ◽  
pp. 104741
Author(s):  
Bartłomiej Kotyra ◽  
Łukasz Chabudziński ◽  
Przemysław Stpiczyński
2019 ◽  
Vol 29 (2) ◽  
pp. 407-419
Author(s):  
Beata Bylina ◽  
Jarosław Bylina

Abstract: The aim of this paper is to investigate dense linear algebra algorithms on shared-memory multicore architectures. We present the design and implementation of a parallel tiled WZ factorization algorithm that can fully exploit such architectures. Three parallel implementations of the algorithm are studied. The first relies only on multithreaded BLAS (Basic Linear Algebra Subprograms) operations. The second, in addition to BLAS operations, uses the OpenMP standard to exploit loop-level parallelism. The third, in addition to BLAS operations, uses the OpenMP task directive with the depend clause. We report the computational performance and speedup of the parallel tiled WZ factorization algorithm on shared-memory multicore architectures for dense, square, diagonally dominant matrices. We then compare our parallel implementations with the corresponding LU factorization from a vendor-implemented LAPACK library. We also analyze the numerical accuracy. Two of our implementations achieve close to the maximal theoretical speedup implied by Amdahl’s law.
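The abstract refers to the speedup bound implied by Amdahl's law without stating it. As a minimal sketch, assuming a fraction p of the run time parallelizes perfectly over n workers, the bound is S(n) = 1 / ((1 - p) + p / n). The function name and parameters below are illustrative and do not come from the paper.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup under Amdahl's law.

    parallel_fraction: share of the serial run time that can be parallelized (0..1).
    workers: number of cores or threads (hypothetical parameter name).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; the parallel part shrinks by a factor of `workers`.
    return 1.0 / (serial_fraction + parallel_fraction / workers)
```

Under this model, a fully parallel workload on 8 workers is bounded by a speedup of 8, while any serial fraction lowers the ceiling regardless of how many cores are added.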


2018 ◽  
Author(s):  
Richard Barnes

To answer geomorphological questions at unprecedented spatial and temporal scales, we need to (a) parse terabyte-scale digital elevation model (DEM) datasets, (b) perform millions of model realizations to pinpoint the parameters that govern landscape evolution, and (c) do so with statistical rigor, which may require thousands of additional realizations. A core set of operations underpins many geomorphic models. These include determination of terrain attributes such as slope and curvature; flow routing; depression flooding and breaching; flat resolution; and flow accumulation. Here, I present RichDEM, a high-performance C++ library and set of wrappers for performing these operations. The library incorporates a number of options for performing each operation and makes full use of modern high-performance capabilities. The library can scale to process DEMs of over one trillion cells and operates effectively on laptops or supercomputers.
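To make the term "flow accumulation" concrete, here is a small serial sketch, assuming single-direction (D8) routing on a DEM stored as a list of rows. It is not RichDEM's API, nor the parallel algorithm of any paper listed here; the names `d8_receiver` and `flow_accumulation` are hypothetical. Each cell drains to its steepest downslope neighbour, and a cell's accumulation is the number of cells (itself included) that drain through it.

```python
import math

# Offsets (row, column) of the eight neighbours of a grid cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_receiver(dem, r, c):
    """Return the (row, col) of the steepest strictly lower neighbour, or None."""
    rows, cols = len(dem), len(dem[0])
    best, best_slope = None, 0.0
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            # Diagonal neighbours are sqrt(2) cell widths away.
            slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
            if slope > best_slope:
                best, best_slope = (nr, nc), slope
    return best


def flow_accumulation(dem):
    """Count, for every cell, how many cells drain through it (itself included)."""
    rows, cols = len(dem), len(dem[0])
    acc = [[1] * cols for _ in range(rows)]
    # Visit cells from highest to lowest so that a cell's total is final
    # before it is passed on to its (strictly lower) receiver.
    cells = sorted(
        ((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    for _, r, c in cells:
        target = d8_receiver(dem, r, c)
        if target is not None:
            acc[target[0]][target[1]] += acc[r][c]
    return acc
```

The sketch ignores flats and depressions: cells with no strictly lower neighbour simply keep their own total, which is why the abstract lists depression flooding and flat resolution as separate operations.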


1995 ◽  
Vol 34 (2) ◽  
pp. 263-272 ◽  
Author(s):  
R. C. Agarwal ◽  
B. Alpern ◽  
L. Carter ◽  
F. G. Gustavson ◽  
D. J. Klepacki ◽  
...  
