Implementation of unstructured grid GMRES+LU-SGS method on shared-memory, cache-based parallel computers

Author(s):  
Dmitri Sharov ◽  
Hong Luo ◽  
Joseph Baum ◽  
Rainald Loehner
2021 ◽  
Vol 26 ◽  
pp. 1-67
Author(s):  
Patrick Dinklage ◽  
Jonas Ellert ◽  
Johannes Fischer ◽  
Florian Kurpicz ◽  
Marvin Löbel

We present new sequential and parallel algorithms for wavelet tree construction based on a new bottom-up technique. This technique makes use of the structure of the wavelet trees—refining the characters represented in a node of the tree with increasing depth—in an opposite way, by first computing the leaves (most refined), and then propagating this information upwards to the root of the tree. We first describe new sequential algorithms, both in RAM and external memory. Based on these results, we adapt these algorithms to parallel computers, where we address both shared memory and distributed memory settings. In practice, all our algorithms outperform previous ones in both time and memory efficiency, because we can compute all auxiliary information solely based on the information we obtained from computing the leaves. Most of our algorithms are also adapted to the wavelet matrix , a variant that is particularly suited for large alphabets.


2005 ◽  
Vol 18 (2) ◽  
pp. 219-224
Author(s):  
Emina Milovanovic ◽  
Natalija Stojanovic

Because many universities lack the funds to purchase expensive parallel computers, cost effective alternatives are needed to teach students about parallel processing. Free software is available to support the three major paradigms of parallel computing. Parallaxis is a sophisticated SIMD simulator which runs on a variety of platforms.jBACI shared memory simulator supports the MIMD model of computing with a common shared memory. PVM and MPI allow students to treat a network of workstations as a message passing MIMD multicomputer with distributed memory. Each of this software tools can be used in a variety of courses to give students experience with parallel algorithms.


2003 ◽  
Vol 11 (2) ◽  
pp. 105-124 ◽  
Author(s):  
Vishal Aslot ◽  
Rudolf Eigenmann

The state of modern computer systems has evolved to allow easy access to multiprocessor systems by supporting multiple processors on a single physical package. As the multiprocessor hardware evolves, new ways of programming it are also developed. Some inventions may merely be adopting and standardizing the older paradigms. One such evolving standard for programming shared-memory parallel computers is the OpenMP API. The Standard Performance Evaluation Corporation (SPEC) has created a suite of parallel programs called SPEC OMP to compare and evaluate modern shared-memory multiprocessor systems using the OpenMP standard. We have studied these benchmarks in detail to understand their performance on a modern architecture. In this paper, we present detailed measurements of the benchmarks. We organize, summarize, and display our measurements using a Quantitative Model. We present a detailed discussion and derivation of the model. Also, we discuss the important loops in the SPEC OMPM2001 benchmarks and the reasons for less than ideal speedup on our platform.


Sign in / Sign up

Export Citation Format

Share Document