A lightweight communication interface for parallel programming environments

Author(s):  
Matthias Brune ◽  
Jörn Gehring ◽  
Alexander Reinefeld
1997 ◽  
Vol 6 (2) ◽  
pp. 215-227 ◽  
Author(s):  
Guy Edjlali ◽  
Gagan Agrawal ◽  
Alan Sussman ◽  
Jim Humphries ◽  
Joel Saltz

For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of High Performance Fortran (HPF)-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.
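
The bookkeeping this abstract describes, recomputing block ownership and loop bounds when the processor count changes, can be illustrated with a short sketch. This is a minimal illustration under our own assumptions (a 1-D block-distributed array; `block_bounds` and `redistribution_plan` are hypothetical names), not the authors' run-time library:

```python
# A minimal sketch (not the paper's library) of what an adaptive runtime
# must redo when the processor count changes: recompute each processor's
# block of a distributed array and the loop bounds over that block.
def block_bounds(n, nprocs, rank):
    """Lower/upper (exclusive) index of `rank`'s block of an n-element array."""
    base, extra = divmod(n, nprocs)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi

def redistribution_plan(n, old_nprocs, new_nprocs):
    """For each (old_rank, new_rank) pair, the index range that must move."""
    plan = []
    for old in range(old_nprocs):
        olo, ohi = block_bounds(n, old_nprocs, old)
        for new in range(new_nprocs):
            nlo, nhi = block_bounds(n, new_nprocs, new)
            lo, hi = max(olo, nlo), min(ohi, nhi)
            if lo < hi and old != new:
                plan.append((old, new, lo, hi))
    return plan

# Example: shrinking from 4 to 3 processors for a 100-element array.
print(block_bounds(100, 3, 0))         # new loop bounds on processor 0
print(redistribution_plan(100, 4, 3))  # element ranges that must be shipped
```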


Author(s):  
A.G. Hoekstra ◽  
P.M.A. Sloot ◽  
F. van der Linden ◽  
M. van Muiswinkel ◽  
J.J.J. Vesseur ◽  
...  

2006 ◽  
Vol 35 (3) ◽  
pp. 227-244 ◽  
Author(s):  
Jacques M. Bahi ◽  
Sylvain Contassot-Vivier ◽  
Raphaël Couturier

2014 ◽  
Vol 22 (3) ◽  
pp. 223-237
Author(s):  
Jan H. Schönherr ◽  
Ben Juurlink ◽  
Jan Richling

While multicore architectures are used across the whole product range from server systems to handheld computers, the deployed software is still making the slow transition from sequential to parallel. This transition, however, is gaining momentum due to the increased availability of sophisticated parallel programming environments. Combined with the ever-increasing complexity of multicore architectures, this results in a scheduling problem that differs from the traditional one, because concurrently executing parallel programs and features such as non-uniform memory access (NUMA), shared caches, and simultaneous multithreading have to be considered. In this paper, we compare different ways of scheduling multiple parallel applications on multicore architectures. Because of emerging parallel programming environments, we primarily consider malleable applications, i.e., applications whose degree of parallelism can be changed on the fly. We propose TACO, a topology-aware scheduling scheme that combines equipartitioning and coscheduling and thereby avoids the drawbacks of the individual concepts. Additionally, TACO is conceptually compatible with contention-aware scheduling strategies. We find that topology-awareness increases performance for all evaluated workloads. The combination with coscheduling is more sensitive to the executed workload and to NUMA effects. However, the gained versatility allows new use cases to be explored that were not possible before.
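
To make the combination of equipartitioning and topology-awareness concrete, here is a hedged sketch, not the TACO implementation: malleable jobs receive equal shares of cores, and the shares are carved from contiguous cores so that each job stays within as few sockets (NUMA domains) as possible. All names (`equipartition`, the job labels, the topology parameters) are illustrative assumptions:

```python
# A hedged sketch (not TACO itself) of topology-aware equipartitioning:
# malleable jobs get an equal share of cores, and each share is carved out
# of contiguous cores so a job crosses as few sockets as possible, letting
# its threads be coscheduled without straddling NUMA domains unnecessarily.
def equipartition(jobs, sockets, cores_per_socket):
    cores = [(s, c) for s in range(sockets) for c in range(cores_per_socket)]
    share, extra = divmod(len(cores), len(jobs))
    alloc, pos = {}, 0
    for i, job in enumerate(jobs):
        n = share + (1 if i < extra else 0)  # equal share, remainder spread
        alloc[job] = cores[pos:pos + n]      # contiguous => socket-local first
        pos += n
    return alloc

# Example: 3 malleable jobs on 2 sockets x 4 cores; each job is resized
# to its (socket, core) share and would adapt its parallelism degree to it.
for job, cpus in equipartition(["A", "B", "C"], 2, 4).items():
    print(job, "->", cpus)
```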


2001 ◽  
Vol 11 (01) ◽  
pp. 41-56 ◽  
Author(s):  
MARCO DANELUTTO

Beowulf-class clusters are attracting growing interest as low-cost parallel architectures: they deliver reasonable performance at a very reasonable cost compared to classical MPP machines. Parallel applications are usually developed on clusters using MPI/PVM message passing or HPF programming environments. Here we discuss new implementation strategies to support skeleton-based structured parallel programming environments on clusters. The adoption of structured parallel programming models greatly reduces the time spent developing new parallel applications on clusters, and our implementation techniques based on macro data flow allow very efficient parallel applications to be developed. We discuss experiments that demonstrate the full feasibility of the approach.
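
The macro data-flow idea, unfolding a skeleton program into independent, fireable instructions executed by a pool of interpreters, can be sketched as follows. This is a minimal illustration under simplified assumptions (a pipeline skeleton only, threads standing in for cluster nodes; `run_pipeline` is a hypothetical name), not the paper's actual implementation:

```python
# A minimal sketch, under our own simplified assumptions, of a macro
# data-flow implementation of skeletons: a pipeline skeleton is unfolded
# into "macro instructions" (one stage applied to one input item), and a
# pool of interpreters executes any instruction whose input token is ready.
from concurrent.futures import ThreadPoolExecutor

def run_pipeline(stages, items, workers=4):
    """Execute pipeline(stages) over items as independent data-flow tasks."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(lambda x=i: x) for i in items]
        for stage in stages:
            # Each stage instance is a fireable instruction: it fires as
            # soon as the token produced by the previous stage is available.
            futures = [pool.submit(lambda f=f, s=stage: s(f.result()))
                       for f in futures]
        return [f.result() for f in futures]

# Example: a three-stage pipeline over a stream of integers.
print(run_pipeline([lambda x: x + 1, lambda x: x * 2, str], range(5)))
```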


2012 ◽  
Vol 182-183 ◽  
pp. 639-643 ◽  
Author(s):  
Xiang Li ◽  
Fei Li ◽  
Chang Hao Wang

In this paper, five typical multi-core processors are compared in terms of threading, cache organization, inter-core interconnect, and other features. Two multi-core programming environments and some new programming languages are introduced. Thread-level speculation (TLS) and transactional memory (TM) are presented as approaches to parallelizing sequential programs. TLS automatically analyzes a sequential program, speculatively identifies the parts that can be executed in parallel, and then generates parallel code. TM systems provide an efficient and easy-to-use mechanism for parallel programming on multi-core processors. Typical TM systems such as TCC, UTM, LogTM, LogTM-SE, and SigTM are introduced. Combining TLS and TM can further improve the performance of sequential programs running on multi-core processors; typical TM systems extended to support TLS, such as TCC, TTM, PTT, and STMlite, are also introduced.
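
The TM programming model this abstract refers to can be illustrated with a toy software-transactional-memory sketch: a transaction buffers its writes, tracks the versions of what it reads, validates at commit, and retries on conflict. This is a deliberately simplified illustration (a single global commit lock; `atomic`, `read`, and `write` are hypothetical names), not how hardware systems such as TCC or LogTM are built:

```python
# A hedged, toy illustration (not TCC/LogTM themselves) of the TM model:
# a transaction buffers writes, records the version of every value it
# reads, and at commit validates that nothing it read has changed; on
# conflict it aborts and retries, sparing the programmer explicit locking.
import threading

_versions, _data, _lock = {}, {}, threading.Lock()

def atomic(txn, retries=100):
    for _ in range(retries):
        reads, writes = {}, {}
        def read(k):
            if k in writes:
                return writes[k]
            reads[k] = _versions.get(k, 0)   # remember the version we saw
            return _data.get(k)
        def write(k, v):
            writes[k] = v                    # buffered until commit
        result = txn(read, write)
        with _lock:                          # commit: validate, then publish
            if all(_versions.get(k, 0) == v for k, v in reads.items()):
                for k, v in writes.items():
                    _data[k] = v
                    _versions[k] = _versions.get(k, 0) + 1
                return result
        # validation failed: another transaction committed first; retry
    raise RuntimeError("transaction aborted too many times")

# Example: a transfer between accounts with no explicit locking by the caller.
_data.update({"a": 100, "b": 0})
atomic(lambda rd, wr: (wr("a", rd("a") - 10), wr("b", rd("b") + 10)))
print(_data["a"], _data["b"])   # 90 10
```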

