A lightweight communication interface for parallel programming environments

Author(s):  
Matthias Brune ◽  
Jörn Gehring ◽  
Alexander Reinefeld
1997 ◽  
Vol 6 (2) ◽  
pp. 215-227 ◽  
Author(s):  
Guy Edjlali ◽  
Gagan Agrawal ◽  
Alan Sussman ◽  
Jim Humphries ◽  
Joel Saltz

For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of High Performance Fortran (HPF)-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.
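
The bookkeeping this abstract describes, recomputing block ownership and loop bounds when the processor count changes, can be illustrated with a short sketch. This is a minimal illustration under our own assumptions (a 1-D block-distributed array; `block_bounds` and `redistribution_plan` are hypothetical names), not the authors' run-time library:

```python
# A minimal sketch (not the paper's library) of what an adaptive runtime
# must redo when the processor count changes: recompute each processor's
# block of a distributed array and the loop bounds over that block.
def block_bounds(n, nprocs, rank):
    """Lower/upper (exclusive) index of `rank`'s block of an n-element array."""
    base, extra = divmod(n, nprocs)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi

def redistribution_plan(n, old_nprocs, new_nprocs):
    """For each (old_rank, new_rank) pair, the index range that must move."""
    plan = []
    for old in range(old_nprocs):
        olo, ohi = block_bounds(n, old_nprocs, old)
        for new in range(new_nprocs):
            nlo, nhi = block_bounds(n, new_nprocs, new)
            lo, hi = max(olo, nlo), min(ohi, nhi)
            if lo < hi and old != new:
                plan.append((old, new, lo, hi))
    return plan

# Example: shrinking from 4 to 3 processors for a 100-element array.
print(block_bounds(100, 3, 0))         # new loop bounds on processor 0
print(redistribution_plan(100, 4, 3))  # element ranges that must be shipped
```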


Author(s):  
A.G. Hoekstra ◽  
P.M.A. Sloot ◽  
F. van der Linden ◽  
M. van Muiswinkel ◽  
J.J.J. Vesseur ◽  
...  

2006 ◽  
Vol 35 (3) ◽  
pp. 227-244 ◽  
Author(s):  
Jacques M. Bahi ◽  
Sylvain Contassot-Vivier ◽  
Raphaël Couturier

2014 ◽  
Vol 22 (3) ◽  
pp. 223-237
Author(s):  
Jan H. Schönherr ◽  
Ben Juurlink ◽  
Jan Richling

While multicore architectures are used across the whole product range from server systems to handheld computers, the deployed software is still making the slow transition from sequential to parallel. This transition, however, is gaining momentum due to the increased availability of sophisticated parallel programming environments. Combined with the ever-increasing complexity of multicore architectures, this results in a scheduling problem that differs from the traditional one, because concurrently executing parallel programs and features such as non-uniform memory access (NUMA), shared caches, and simultaneous multithreading have to be considered. In this paper, we compare different ways of scheduling multiple parallel applications on multicore architectures. Because of emerging parallel programming environments, we primarily consider malleable applications, i.e., applications whose degree of parallelism can be changed on the fly. We propose TACO, a topology-aware scheduling scheme that combines equipartitioning and coscheduling and thereby avoids the drawbacks of the individual concepts. Additionally, TACO is conceptually compatible with contention-aware scheduling strategies. We find that topology-awareness increases performance for all evaluated workloads. The combination with coscheduling is more sensitive to the executed workload and to NUMA effects. However, the gained versatility allows new use cases to be explored that were not possible before.
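
To make the combination of equipartitioning and topology-awareness concrete, here is a hedged sketch, not the TACO implementation: malleable jobs receive equal shares of cores, and the shares are carved from contiguous cores so that each job stays within as few sockets (NUMA domains) as possible. All names (`equipartition`, the job labels, the topology parameters) are illustrative assumptions:

```python
# A hedged sketch (not TACO itself) of topology-aware equipartitioning:
# malleable jobs get an equal share of cores, and each share is carved out
# of contiguous cores so a job crosses as few sockets as possible, letting
# its threads be coscheduled without straddling NUMA domains unnecessarily.
def equipartition(jobs, sockets, cores_per_socket):
    cores = [(s, c) for s in range(sockets) for c in range(cores_per_socket)]
    share, extra = divmod(len(cores), len(jobs))
    alloc, pos = {}, 0
    for i, job in enumerate(jobs):
        n = share + (1 if i < extra else 0)  # equal share, remainder spread
        alloc[job] = cores[pos:pos + n]      # contiguous => socket-local first
        pos += n
    return alloc

# Example: 3 malleable jobs on 2 sockets x 4 cores; each job is resized
# to its (socket, core) share and would adapt its parallelism degree to it.
for job, cpus in equipartition(["A", "B", "C"], 2, 4).items():
    print(job, "->", cpus)
```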


2001 ◽  
Vol 11 (01) ◽  
pp. 41-56 ◽  
Author(s):  
MARCO DANELUTTO

Beowulf-class clusters are attracting growing interest as low-cost parallel architectures: they deliver reasonable performance at a very reasonable cost compared to classical MPP machines. Parallel applications are usually developed on clusters using MPI/PVM message passing or HPF programming environments. Here we discuss new implementation strategies to support skeleton-based structured parallel programming environments on clusters. The adoption of structured parallel programming models greatly reduces the time spent developing new parallel applications on clusters, and our implementation techniques based on macro data flow allow very efficient parallel applications to be developed. We discuss experiments that demonstrate the full feasibility of the approach.
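
The macro data-flow idea, unfolding a skeleton program into independent, fireable instructions executed by a pool of interpreters, can be sketched as follows. This is a minimal illustration under simplified assumptions (a pipeline skeleton only, threads standing in for cluster nodes; `run_pipeline` is a hypothetical name), not the paper's actual implementation:

```python
# A minimal sketch, under our own simplified assumptions, of a macro
# data-flow implementation of skeletons: a pipeline skeleton is unfolded
# into "macro instructions" (one stage applied to one input item), and a
# pool of interpreters executes any instruction whose input token is ready.
from concurrent.futures import ThreadPoolExecutor

def run_pipeline(stages, items, workers=4):
    """Execute pipeline(stages) over items as independent data-flow tasks."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(lambda x=i: x) for i in items]
        for stage in stages:
            # Each stage instance is a fireable instruction: it fires as
            # soon as the token produced by the previous stage is available.
            futures = [pool.submit(lambda f=f, s=stage: s(f.result()))
                       for f in futures]
        return [f.result() for f in futures]

# Example: a three-stage pipeline over a stream of integers.
print(run_pipeline([lambda x: x + 1, lambda x: x * 2, str], range(5)))
```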


2012 ◽  
Vol 182-183 ◽  
pp. 639-643 ◽  
Author(s):  
Xiang Li ◽  
Fei Li ◽  
Chang Hao Wang

In this paper, five typical multi-core processors are compared in terms of threading, cache organization, inter-core interconnect, and other features. Two multi-core programming environments and some new programming languages are introduced. Thread-level speculation (TLS) and transactional memory (TM) are presented as approaches to parallelizing sequential programs. TLS automatically analyzes a sequential program, speculatively identifies the parts that can be executed in parallel, and then generates parallel code. TM systems provide an efficient and easy-to-use mechanism for parallel programming on multi-core processors. Typical TM systems such as TCC, UTM, LogTM, LogTM-SE, and SigTM are introduced. Combining TLS and TM can further improve the performance of sequential programs running on multi-core processors; typical TM systems extended to support TLS, such as TCC, TTM, PTT, and STMlite, are also introduced.
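
The TM programming model this abstract refers to can be illustrated with a toy software-transactional-memory sketch: a transaction buffers its writes, tracks the versions of what it reads, validates at commit, and retries on conflict. This is a deliberately simplified illustration (a single global commit lock; `atomic`, `read`, and `write` are hypothetical names), not how hardware systems such as TCC or LogTM are built:

```python
# A hedged, toy illustration (not TCC/LogTM themselves) of the TM model:
# a transaction buffers writes, records the version of every value it
# reads, and at commit validates that nothing it read has changed; on
# conflict it aborts and retries, sparing the programmer explicit locking.
import threading

_versions, _data, _lock = {}, {}, threading.Lock()

def atomic(txn, retries=100):
    for _ in range(retries):
        reads, writes = {}, {}
        def read(k):
            if k in writes:
                return writes[k]
            reads[k] = _versions.get(k, 0)   # remember the version we saw
            return _data.get(k)
        def write(k, v):
            writes[k] = v                    # buffered until commit
        result = txn(read, write)
        with _lock:                          # commit: validate, then publish
            if all(_versions.get(k, 0) == v for k, v in reads.items()):
                for k, v in writes.items():
                    _data[k] = v
                    _versions[k] = _versions.get(k, 0) + 1
                return result
        # validation failed: another transaction committed first; retry
    raise RuntimeError("transaction aborted too many times")

# Example: a transfer between accounts with no explicit locking by the caller.
_data.update({"a": 100, "b": 0})
atomic(lambda rd, wr: (wr("a", rd("a") - 10), wr("b", rd("b") + 10)))
print(_data["a"], _data["b"])   # 90 10
```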

