Comparing task and data parallel execution schemes for the DIIRK method

Author(s):  
Thomas Rauber ◽  
Gudula Rünger
1999 ◽  
Vol 7 (1) ◽  
pp. 1-19


Author(s):  
Xiaodong Zhang ◽  
Lin Sun

Shared-memory and data-parallel programming models are two important paradigms for scientific applications. Both models provide high-level program abstractions and simple, uniform views of network structures. These common features significantly simplify program coding and debugging for scientific applications. However, the underlying execution and overhead patterns differ significantly between the two models, owing to their programming constraints and to the different, complex structures of the interconnection networks and systems that support them. We performed this experimental study to present implications and comparisons of execution patterns on two commercial architectures. We implemented a standard electromagnetic simulation program (EM) and a linear system solver using the shared-memory model on the KSR-1 and the data-parallel model on the CM-5. Our objectives are to examine the execution pattern changes required to transform an implementation between the two models; to study memory access patterns; to address scalability issues; and to investigate the relative costs and advantages/disadvantages of using the two models for scientific computations. Our results indicate that, as the systems and the problems are scaled, the EM program tends to become computation-intensive on the KSR-1 shared-memory system and memory-demanding on the CM-5 data-parallel system. The EM program, a highly data-parallel program, performed extremely well, while the linear system solver, a highly control-structured program, suffered significantly under the data-parallel model on the CM-5. Our study provides further evidence that matching the execution patterns of algorithms to parallel architectures achieves better performance.
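The contrast between the two models can be illustrated with a small sketch (ours, not the paper's code; the names `relax_row`, `step_shared_memory`, and `step_data_parallel` are hypothetical): the same relaxation step written once in a shared-memory style, where explicit threads read a shared array, and once in a data-parallel style, where a single whole-array operation is applied and the runtime decides the distribution.

```python
from concurrent.futures import ThreadPoolExecutor

def relax_row(grid, i):
    # Average each interior element with its left/right neighbours;
    # boundary elements are kept as-is.
    row = grid[i]
    return ([row[0]]
            + [(row[j - 1] + row[j] + row[j + 1]) / 3
               for j in range(1, len(row) - 1)]
            + [row[-1]])

def step_shared_memory(grid, nthreads=4):
    # Shared-memory style: explicit threads, each reading the shared
    # 'grid' and producing one row of the result.
    with ThreadPoolExecutor(nthreads) as pool:
        return list(pool.map(lambda i: relax_row(grid, i), range(len(grid))))

def step_data_parallel(grid):
    # Data-parallel style: one whole-array operation; how the elements
    # are distributed is left to the compiler/runtime.
    return [relax_row(grid, i) for i in range(len(grid))]
```

Both functions compute the same result; the difference the study measures is where the coordination cost lands: in explicit thread scheduling and shared-memory traffic, or in the runtime's distribution of the array operation.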


2008 ◽  
Vol 18 (01) ◽  
pp. 23-37 ◽  
Author(s):  
CLEMENS GRELCK ◽  
STEFFEN KUTHE ◽  
SVEN-BODO SCHOLZ

We propose a novel execution model for the implicitly parallel execution of data-parallel programs in the presence of general I/O operations. The model is called hybrid because it combines the advantages of the two standard execution models, fork/join and SPMD. Based on program analysis, the hybrid model adapts itself to one model or the other at the granularity of individual instructions. We outline compilation techniques that systematically derive the organization of parallel code from data-flow characteristics, aiming to reduce execution-mode switches in general and synchronization/communication requirements in particular. Experiments based on a prototype implementation show the effectiveness of the hybrid execution model in reducing parallel overhead.
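The two standard models the hybrid combines can be caricatured in a few lines (our sketch, not the paper's implementation; all names are hypothetical): under fork/join, a master forks workers for each data-parallel section and joins them before the next sequential instruction, while under SPMD, every process runs the whole program on its own slice and guards sequential effects such as I/O by rank.

```python
from concurrent.futures import ThreadPoolExecutor

def fork_join(data, f):
    # Fork/join: the master forks workers for a data-parallel section
    # and joins them before executing the next sequential instruction.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(f, data))

def spmd(data, f, rank, nprocs):
    # SPMD: every process executes the entire program; each applies f
    # only to its own cyclic slice of the data.
    if rank == 0:
        pass  # sequential effects (e.g. I/O) would be guarded by rank
    return [f(x) for x in data[rank::nprocs]]
```

Fork/join pays for a mode switch at every parallel section; SPMD avoids those switches but must replicate sequential code and synchronize around effects, which is the trade-off the hybrid model arbitrates per instruction.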


1999 ◽  
Vol 9 (4) ◽  
pp. 427-462 ◽  
Author(s):  
SUSUMU NISHIMURA ◽  
ATSUSHI OHORI

This article proposes a new language mechanism for data-parallel processing of dynamically allocated, recursively defined data. Unlike conventional array-based data parallelism, it allows parallel processing of general recursively defined data, such as lists or trees, in a functional way. This is achieved by representing a recursively defined datum as a system of equations and defining new language constructs for the parallel transformation of such a system. By integrating these constructs with a higher-order functional language, we obtain a functional programming language suitable for describing data-parallel algorithms on recursively defined data in a declarative way. The language has an ML-style polymorphic type system and a type-sound operational semantics that uniformly integrates the parallel evaluation mechanism with the semantics of a typed functional language. We also show the intended parallel execution model behind the formal semantics, assuming an idealized distributed-memory multicomputer.
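The equation-system representation can be sketched briefly (our illustration, not the article's language; the `Cons`/`Nil` encoding and `par_map` are hypothetical): a list is a set of equations x_i = Cons(h_i, x_{i+1}) ending in x_n = Nil, and a data-parallel map rewrites every equation simultaneously and independently, which is what makes the recursive structure amenable to parallel processing.

```python
def par_map(f, eqs):
    # Transform every equation of the system independently; a real
    # runtime would rewrite them in parallel on separate processors.
    out = {}
    for name, eq in eqs.items():
        if eq[0] == 'Cons':        # x = Cons(head, tail-variable)
            _, head, tail = eq
            out[name] = ('Cons', f(head), tail)
        else:                      # x = Nil
            out[name] = eq
    return out

# The list [1, 2] as a system of equations over variables x0, x1, x2:
eqs = {'x0': ('Cons', 1, 'x1'), 'x1': ('Cons', 2, 'x2'), 'x2': ('Nil',)}
```

Because each equation mentions its successor only by variable name, no equation's rewrite depends on another's result, so the whole system can be transformed in one parallel step.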


2001 ◽  
Vol 11 (04) ◽  
pp. 423-437 ◽  
Author(s):  
F. LOULERGUE

The BS λp-calculus is a calculus of functional bulk synchronous parallel (BSP) programs. It is the basis for the design of a bulk synchronous parallel ML language. For data-parallel languages there are two points of view: the programming model, in which a program is seen as a sequence of operations on parallel vectors, and the execution model, in which the program is a parallel composition of programs run on each processor of the parallel machine. BSP algorithms are defined as data-parallel algorithms with explicit (physical) processes so that their parallel execution time can be estimated. We present here a minimally synchronous distributed evaluation for BSP execution (corresponding to the execution model). This distributed evaluation is correct with respect to the call-by-value strategy of the BS λp-calculus (corresponding to the programming model).
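The execution-model side can be sketched as a single BSP superstep (our sketch under a toy message-routing scheme, not Loulergue's formal semantics): each processor computes locally and emits messages, all messages are then delivered, and an implicit barrier ends the superstep.

```python
def bsp_superstep(local_states, compute):
    # Local computation phase: processor p maps its state to
    # (new_state, [(dest, payload), ...]).
    results = [compute(p, s) for p, s in enumerate(local_states)]
    # Communication phase: deliver every message to its destination.
    inboxes = [[] for _ in local_states]
    for _, msgs in results:
        for dest, payload in msgs:
            inboxes[dest].append(payload)
    # Implicit barrier: the next superstep starts only after all
    # states are updated and all messages are delivered.
    return [state for state, _ in results], inboxes
```

A ring shift, where each processor sends its old state to its right neighbour, is one superstep of this evaluator; the BSP cost model charges it one local-computation term, one communication term, and one barrier.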


2019 ◽  
Vol 65 ◽  
pp. 136-147
Author(s):  
Libo Huang ◽  
Yashuai Lü ◽  
Sheng Ma ◽  
Nong Xiao ◽  
Zhiying Wang
