A shared virtual memory network with fast remote direct memory access and message passing

Existing user-level network interfaces deliver high-bandwidth, low-latency performance to applications, but are typically unable to support diverse styles of communication and are unsuitable for use in multiprogrammed environments. Often this is because the network abstraction is presented at too high a level, and support for synchronisation is inflexible. In this paper we present a new primitive for in-band synchronisation: the Tripwire. Tripwires provide a flexible, efficient and scalable means of synchronisation that is orthogonal to data transfer. We describe the implementation of a non-coherent distributed shared memory network interface, with Tripwires for synchronisation. This interface provides a low-level communications model with gigabit-class bandwidth and very low overhead and latency. We show how it supports a variety of communication styles, including remote procedure call, message passing and streaming.
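
The abstract does not say how a Tripwire is realised, so the following is only a toy illustration of the general idea of synchronisation that is separate from data transfer. It assumes a Tripwire can be modelled as a callback attached to one offset of a shared region, fired when a write covers that offset. The names `SharedRegion`, `set_tripwire`, `remote_write` and `demo` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List


class SharedRegion:
    """Toy stand-in for a memory region that a remote node can write into."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        # Maps a watched offset to the callbacks registered on it (assumed model).
        self._tripwires: Dict[int, List[Callable[[int], None]]] = {}

    def set_tripwire(self, offset: int, callback: Callable[[int], None]) -> None:
        """Register a callback that fires when `offset` is written."""
        if not 0 <= offset < len(self.data):
            raise IndexError("tripwire offset outside region")
        self._tripwires.setdefault(offset, []).append(callback)

    def remote_write(self, offset: int, payload: bytes) -> None:
        """Store the payload, then fire every tripwire inside the written range."""
        end = offset + len(payload)
        if offset < 0 or end > len(self.data):
            raise IndexError("write outside region")
        self.data[offset:end] = payload
        for watched in sorted(self._tripwires):
            if offset <= watched < end:
                for callback in list(self._tripwires[watched]):
                    callback(watched)


def demo() -> List[bytes]:
    """Message passing built from plain writes plus one tripwire on a flag byte."""
    region = SharedRegion(64)
    flag_offset = 32
    received: List[bytes] = []
    region.set_tripwire(
        flag_offset, lambda _off: received.append(bytes(region.data[0:5]))
    )
    region.remote_write(0, b"hello")            # data transfer only, nothing fires
    assert received == []
    region.remote_write(flag_offset, b"\x01")   # flag write trips the wire
    return received
```

In this sketch the data write and the notification are decoupled: the receiver is told nothing until the sender touches the watched flag, which is one way to read "orthogonal to data transfer". A real network interface would do the address matching in hardware rather than in a Python loop.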

A comparison of shared virtual memory and message passing programming techniques based on a finite element application

Parallel Processing: CONPAR 94 — VAPP VI - Lecture Notes in Computer Science ◽ 10.1007/3-540-58430-7_41 ◽ 1994 ◽ pp. 461-472

Author(s): Rudolf Berrendorf ◽ Michael Gerndt ◽ Zakaria Lahjomri ◽ Thierry Priol

Keyword(s): Finite Element ◽ Message Passing ◽ Virtual Memory ◽ Programming Techniques ◽ Shared Virtual Memory

FRAMP: A fast remote direct memory access and message passing network

IEEE International Symposium on Communications and Information Technology, 2004. ISCIT 2004. ◽ 10.1109/iscit.2004.1412914 ◽ 2005

Author(s): Gang Shi ◽ Mingehang Hu ◽ Hongda Yin ◽ Weiwu Hu ◽ Zhimin Tang

Keyword(s): Message Passing ◽ Direct Memory Access ◽ Memory Access

Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided

Scientific Programming ◽ 10.1155/2014/571902 ◽ 2014 ◽ Vol 22 (2) ◽ pp. 75-91 ◽ Cited By ~ 12

Author(s): Robert Gerstenberger ◽ Maciej Besta ◽ Torsten Hoefler

Keyword(s): Message Passing ◽ Direct Memory Access ◽ Memory Access ◽ Remote Memory ◽ Memory Consumption ◽ Performance Models ◽ Application Performance ◽ Performance Improvements ◽ Programming Interface ◽ Better Than

Modern interconnects offer remote direct memory access (RDMA) features. Yet most applications rely on explicit message passing for communication, despite its unwanted overheads. The MPI-3.0 standard defines a programming interface for exploiting RDMA networks directly; however, its scalability and practicability have yet to be demonstrated in practice. In this work, we develop scalable bufferless protocols that implement the MPI-3.0 specification. Our protocols support scaling to millions of cores with negligible memory consumption while providing the highest performance and minimal overheads. To arm programmers, we provide a spectrum of performance models for all critical functions and demonstrate the usability of our library and models with several application studies with up to half a million processes. We show that our design is comparable to, or better than, UPC and Fortran Coarrays in terms of latency, bandwidth and message rate. We also demonstrate application performance improvements with comparable programming complexity.
