Implementation and Performance of DSMPI

Luis M. Silva; João Gabriel Silva; Simon Chapple

doi:10.1155/1997/452521

Scientific Programming ◽

10.1155/1997/452521 ◽

1997 ◽

Vol 6 (2) ◽

pp. 201-214 ◽

Cited By ~ 2

Author(s):

Luis M. Silva ◽

João Gabriel Silva ◽

Simon Chapple

Keyword(s):

Shared Memory ◽

Message Passing ◽

Distributed Memory ◽

Programming Model ◽

Distributed Shared Memory ◽

Memory Systems ◽

Distributed Memory Machines ◽

Coherence Protocols ◽

And Performance ◽

Performance Results

Distributed shared memory (DSM) has been recognized as an alternative programming model for exploiting the parallelism in distributed memory systems because it provides a higher level of abstraction than simple message passing. DSM combines the simple programming model of shared memory with the scalability of distributed memory machines. This article presents DSMPI, a parallel library that runs atop MPI and provides a DSM abstraction. It provides an easy-to-use programming interface, is fully portable, and supports heterogeneity. For the sake of flexibility, it supports different coherence protocols and models of consistency. We present some performance results taken on a network of workstations and on a Cray T3D which show that DSMPI can be competitive with MPI for some applications.
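
To make the idea of a coherence protocol layered over message passing concrete, the following is a minimal sketch of a write-invalidate scheme. It is not DSMPI's interface or implementation: the `Node` and `Network` classes, their methods, and the message counter are hypothetical names chosen for illustration, and the "network" is an in-process stand-in for a message layer such as MPI.

```python
"""Toy write-invalidate coherence over explicit messages (illustration only)."""


class Node:
    def __init__(self, rank, network):
        self.rank = rank
        self.network = network   # hypothetical stand-in for a message layer
        self.cache = {}          # locally cached copies: name -> value

    def read(self, name):
        # Cache miss: fetch the current value from the home copy.
        if name not in self.cache:
            self.cache[name] = self.network.fetch(name)
        return self.cache[name]

    def write(self, name, value):
        # Write-invalidate: update the home copy, drop every other cached copy.
        self.network.store(name, value, writer=self.rank)
        self.cache[name] = value


class Network:
    def __init__(self, size):
        self.home = {}           # authoritative copy of each shared variable
        self.messages = 0        # simplified message count
        self.nodes = [Node(rank, self) for rank in range(size)]

    def fetch(self, name):
        self.messages += 1       # request/reply pair counted as one message
        return self.home[name]

    def store(self, name, value, writer):
        self.home[name] = value
        self.messages += 1       # update sent to the home copy
        for node in self.nodes:
            if node.rank != writer and name in node.cache:
                del node.cache[name]
                self.messages += 1   # one invalidation per stale copy


net = Network(size=3)
a, b, c = net.nodes
a.write("x", 1)
first = (b.read("x"), c.read("x"))    # both miss and fetch the value
a.write("x", 2)                       # invalidates the copies held by b and c
second = (b.read("x"), c.read("x"))   # both miss again and see the new value
```

Reads that hit the local cache cost no messages in this sketch, which is the usual motivation for caching shared data; a different protocol (for example, write-update) would trade invalidations for eager propagation of new values.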

MIMD, Multiple Instruction, Multiple Data

Introduction to Parallel Computing ◽

10.1093/oso/9780198515760.003.0010 ◽

2004 ◽

Author(s):

Wesley Petersen ◽

Peter Arbenz

Keyword(s):

Shared Memory ◽

Message Passing ◽

Distributed Memory ◽

Programming Model ◽

Data Access ◽

File Server ◽

Distributed Memory Machines ◽

Shared Data ◽

Multiple Data ◽

Common Memory

The multiple instruction, multiple data (MIMD) programming model usually refers to computing on distributed memory machines with multiple independent processors. Although processors may run independent instruction streams, we are interested in streams that are always portions of a single program. Between processors which share a coherent memory view (within a node), data access is immediate, whereas between nodes data access is effected by message passing. In this book, we use MPI for such message passing. MPI has emerged as a more or less standard message passing system used on both shared memory and distributed memory machines. It is often the case that although the system consists of multiple independent instruction streams, the programming model is not too different from SIMD. Namely, the totality of a program is logically split into many independent tasks, each processed by a group (see Appendix D) of processes, but the overall program is effectively single threaded at the beginning, and likewise at the end. The MIMD model, however, is extremely flexible in that no one process is always master and the other processes slaves. A communicator group of processes performs certain tasks, usually with an arbitrary master/slave relationship. One process may be assigned to be master (or root) and coordinates the tasks of others in the group. We emphasize that the assignment of which process is root is arbitrary: any processor may be chosen. Frequently, however, this choice is one of convenience, for example a file server node. Processors and memory are connected by a network (see, for example, Figure 5.1). In this form, each processor has its own local memory. This is not always the case: the Cray X1 and the NEC SX-6 through SX-8 series machines have common memory within nodes. Within a node, memory coherency is maintained within local caches. Between nodes, it remains the programmer’s responsibility to assure a proper read–update relationship in the shared data. Data updated by one set of processes should not be clobbered by another set until the data are properly used.
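
The root-coordinated pattern described above can be sketched without an MPI installation. The code below is a single-process simulation, not MPI code: `run_group`, the per-rank `inbox` queues, and the `sent` counter are hypothetical names used only to show that any rank can act as root and that the result does not depend on which one is chosen.

```python
from collections import deque


def run_group(data, size, root):
    """Simulate `size` processes summing `data`, coordinated by rank `root`."""
    if not 0 <= root < size:
        raise ValueError("root must be a valid rank")
    inbox = [deque() for _ in range(size)]   # one message queue per rank
    sent = 0                                 # messages that cross rank boundaries

    # Scatter: the root keeps one chunk and sends the others as messages.
    chunk = (len(data) + size - 1) // size
    for rank in range(size):
        inbox[rank].append(data[rank * chunk:(rank + 1) * chunk])
        if rank != root:
            sent += 1

    # Every rank, the root included, works on its own chunk independently.
    partial = [sum(inbox[rank].popleft()) for rank in range(size)]

    # Reduce: the other ranks send their partial sums back to the root.
    for rank in range(size):
        inbox[root].append(partial[rank])
        if rank != root:
            sent += 1
    return sum(inbox[root]), sent


data = list(range(1, 101))
results = [run_group(data, size=4, root=r) for r in range(4)]
```

In a real MPI program the two loops would correspond to collective scatter and reduce operations taking a root argument; here every choice of `root` yields the same total and the same number of simulated messages.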

Parallel Array Classes and Lightweight Sharing Mechanisms

Scientific Programming ◽

10.1155/1993/393409 ◽

1993 ◽

Vol 2 (4) ◽

pp. 203-216

Author(s):

Steve W. Otto

Keyword(s):

Finite Element Method ◽

Shared Memory ◽

Message Passing ◽

Distributed Memory ◽

Programming Model ◽

Memory Usage ◽

Particle In Cell ◽

Parallel Array ◽

Memory Architectures ◽

Shared Memory Architectures

We discuss a set of parallel array classes, MetaMP, for distributed-memory architectures. The classes are implemented in C++ and interface to the PVM or Intel NX message-passing systems. An array class implements a partitioned array as a set of objects distributed across the nodes – a "collective" object. Object methods hide the low-level message-passing and implement meaningful array operations. These include transparent guard strips (or sharing regions) that support finite-difference stencils, reductions and multibroadcasts for support of pivoting and row operations, and interpolation/contraction operations for support of multigrid algorithms. The concept of guard strips is generalized to an object implementation of lightweight sharing mechanisms for finite element method (FEM) and particle-in-cell (PIC) algorithms. The sharing is accomplished through the mechanism of weak memory coherence and can be efficiently implemented. The price of the efficient implementation is memory usage and the need to explicitly specify the coherence operations. An intriguing feature of this programming model is that it maps well to both distributed-memory and shared-memory architectures.
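
The guard-strip idea can be illustrated with a one-dimensional partitioned array and a three-point stencil. This is a minimal sketch under stated assumptions, not the MetaMP classes (which are C++): the function names, the zero boundary values, and the plain-list "blocks" are all hypothetical, and the neighbour copy in `exchange_guards` stands in for what would be message passing between nodes.

```python
def partition(values, nparts):
    """Split `values` into `nparts` blocks, each padded with one guard cell per side."""
    if nparts < 1 or len(values) % nparts or not values:
        raise ValueError("length must be a positive multiple of nparts")
    size = len(values) // nparts
    return [[0.0] + values[p * size:(p + 1) * size] + [0.0] for p in range(nparts)]


def exchange_guards(blocks):
    """Explicit coherence step: refresh each block's guard cells from its neighbours."""
    for i, block in enumerate(blocks):
        # Block layout: [left guard, interior cells..., right guard].
        block[0] = blocks[i - 1][-2] if i > 0 else 0.0
        block[-1] = blocks[i + 1][1] if i < len(blocks) - 1 else 0.0


def smooth(blocks):
    """Apply a 3-point average to every interior cell, block by block."""
    exchange_guards(blocks)
    return [
        [block[0]]
        + [(block[j - 1] + block[j] + block[j + 1]) / 3.0
           for j in range(1, len(block) - 1)]
        + [block[-1]]
        for block in blocks
    ]


def gather(blocks):
    """Concatenate the interior cells of all blocks, dropping the guards."""
    return [x for block in blocks for x in block[1:-1]]


def smooth_serial(values):
    """Reference result computed on the undivided array."""
    padded = [0.0] + values + [0.0]
    return [(padded[j - 1] + padded[j] + padded[j + 1]) / 3.0
            for j in range(1, len(padded) - 1)]


values = [float(i * i) for i in range(12)]
distributed = gather(smooth(partition(values, 3)))
serial = smooth_serial(values)
```

The guard cells cost extra memory and must be refreshed explicitly before each stencil sweep, which mirrors the trade-off the abstract names: efficient sharing in exchange for memory usage and explicit coherence operations.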

Towards a portable hierarchical view of distributed shared memory systems

Proceedings of the Eleventh International Workshop on Programming Models and Applications for Multicores and Manycores ◽

10.1145/3380536.3380542 ◽

2020 ◽

Author(s):

Millad Ghane ◽

Sunita Chandrasekaran ◽

Margaret S. Cheung

Keyword(s):

Shared Memory ◽

Distributed Shared Memory ◽

Memory Systems

Logging and recovery in adaptive software distributed shared memory systems

Proceedings of the 18th IEEE Symposium on Reliable Distributed Systems ◽

10.1109/reldis.1999.805096 ◽

2003 ◽

Cited By ~ 3

Author(s):

A. Kongmunvattana ◽

Nian-Feng Tzeng

Keyword(s):

Shared Memory ◽

Distributed Shared Memory ◽

Memory Systems ◽

Adaptive Software ◽

Software Distributed Shared Memory

AN OPTICALLY INTERCONNECTED DISTRIBUTED SHARED MEMORY SYSTEM: ARCHITECTURE AND PERFORMANCE ANALYSIS

International Journal of High Speed Computing ◽

10.1142/s0129053392000080 ◽

1992 ◽

Vol 04 (03) ◽

pp. 179-212 ◽

Cited By ~ 2

Author(s):

KALYANI BOGINENI ◽

PATRICK W. DOWD

Keyword(s):

Performance Analysis ◽

Shared Memory ◽

System Architecture ◽

Distributed Shared Memory ◽

Memory System ◽

Shared Memory System ◽

And Performance

Towards OpenMP Execution on Software Distributed Shared Memory Systems

Lecture Notes in Computer Science - High Performance Computing ◽

10.1007/3-540-47847-7_42 ◽

2002 ◽

pp. 457-468 ◽

Cited By ~ 15

Author(s):

Ayon Basumallik ◽

Seung-Jai Min ◽

Rudolf Eigenmann

Keyword(s):

Shared Memory ◽

Distributed Shared Memory ◽

Memory Systems ◽

Software Distributed Shared Memory

Teaching tools for parallel processing

Facta universitatis - series Electronics and Energetics ◽

10.2298/fuee0502219m ◽

2005 ◽

Vol 18 (2) ◽

pp. 219-224

Author(s):

Emina Milovanovic ◽

Natalija Stojanovic

Keyword(s):

Parallel Computing ◽

Parallel Processing ◽

Shared Memory ◽

Message Passing ◽

Distributed Memory ◽

Cost Effective ◽

Parallel Computers ◽

Free Software ◽

Teaching Tools ◽

Network Of Workstations

Because many universities lack the funds to purchase expensive parallel computers, cost-effective alternatives are needed to teach students about parallel processing. Free software is available to support the three major paradigms of parallel computing. Parallaxis is a sophisticated SIMD simulator which runs on a variety of platforms. The jBACI shared memory simulator supports the MIMD model of computing with a common shared memory. PVM and MPI allow students to treat a network of workstations as a message passing MIMD multicomputer with distributed memory. Each of these software tools can be used in a variety of courses to give students experience with parallel algorithms.

Cluster-Enabled OpenMP: An OpenMP Compiler for the SCASH Software Distributed Shared Memory System

Scientific Programming ◽

10.1155/2001/605217 ◽

2001 ◽

Vol 9 (2-3) ◽

pp. 123-130 ◽

Cited By ~ 21

Author(s):

Mitsuhisa Sato ◽

Hiroshi Harada ◽

Atsushi Hasegawa ◽

Yutaka Ishikawa

Keyword(s):

Shared Memory ◽

Programming Model ◽

Distributed Shared Memory ◽

Memory System ◽

Data Mapping ◽

Loop Scheduling ◽

Parallel Programming Model ◽

Scheduling Method ◽

Shared Memory System ◽

Software Distributed Shared Memory

OpenMP is attracting widespread interest because of its easy-to-use parallel programming model for shared memory multiprocessors. We have implemented a "cluster-enabled" OpenMP compiler for a page-based software distributed shared memory system, SCASH, which works on a cluster of PCs. It allows OpenMP programs to run transparently in a distributed memory environment. The compiler transforms OpenMP programs into parallel programs using SCASH so that shared global variables are allocated at run time in the shared address space of SCASH. A set of directives is added to specify data mapping and a loop scheduling method that schedules iterations onto threads according to the data mapping. Our experimental results show that the data mapping may greatly affect the performance of OpenMP programs in the software distributed shared memory system. The performance of some NAS parallel benchmark programs in OpenMP is improved by using our extended directives.
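
Why a loop schedule tied to the data mapping matters can be shown with a small counting model. The sketch below does not reproduce the compiler's directives or SCASH itself: the block mapping, the two schedules, and all function names are hypothetical, and a "remote access" simply means an iteration running on a thread other than the owner of the element it touches.

```python
def block_owner(index, n, nthreads):
    """Block data mapping: the thread on which element `index` of n elements lives."""
    block = (n + nthreads - 1) // nthreads
    return index // block


def cyclic_schedule(n, nthreads):
    """Round-robin loop schedule that ignores where the data lives."""
    return [i % nthreads for i in range(n)]


def affinity_schedule(n, nthreads):
    """Schedule iteration i on the thread that owns element i."""
    return [block_owner(i, n, nthreads) for i in range(n)]


def remote_accesses(schedule, n, nthreads):
    """Count iterations that touch an element owned by another thread."""
    return sum(schedule[i] != block_owner(i, n, nthreads) for i in range(n))


n, nthreads = 16, 4
cyclic_remote = remote_accesses(cyclic_schedule(n, nthreads), n, nthreads)
affinity_remote = remote_accesses(affinity_schedule(n, nthreads), n, nthreads)
```

With 16 elements on 4 threads, the round-robin schedule leaves 12 of the 16 iterations touching data owned elsewhere, while the affinity schedule leaves none. On a page-based software DSM each such remote access can trigger page traffic, which is consistent with the abstract's observation that data mapping strongly influences performance.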
