I-DMAC: An Intelligent DMA Controller for Utilization-Aware Video Streaming Used in AI Applications

2021
pp. 60-70
Author(s):
Piyush Kumar Shukla,
Prashant Kumar Shukla

Interpreting large data streams requires high-performance, repeated transfers that overload the microprocessors of a System on Chip (SoC). A direct memory access (DMA) controller (DMAC) addresses this by performing bulk data transfers without CPU involvement. In this work, we designed an intelligent DMAC (I-DMAC) that moves video processing data without CPU intervention. The model includes a bus selection module, user control signals, a status register, DMA-supported addressing, and an AXI-PCI subsystem for improved video frame analysis. These modules are verified experimentally on a Xilinx FPGA SoC using VHDL simulation, and the results are compared with the E-DMAC model.
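The abstract names the I-DMAC's building blocks only at a high level. As a rough illustration of how a host CPU typically drives a memory-mapped DMA controller on an FPGA SoC, the Python sketch below maps a register block with mmap and programs a single frame transfer; the base address, register offsets, and bit fields are invented for illustration and are not the I-DMAC's actual interface.

    # Hypothetical memory-mapped register programming for a DMA controller on an
    # FPGA SoC. Base address, offsets, and bit fields are illustrative only.
    import mmap
    import os
    import struct

    DMAC_BASE  = 0x4000_0000   # assumed AXI base address of the controller
    REG_SRC    = 0x00          # source (frame buffer) address
    REG_DST    = 0x04          # destination address
    REG_LEN    = 0x08          # transfer length in bytes
    REG_CTRL   = 0x0C          # bit 0 = start, bit 1 = interrupt enable (assumed)
    REG_STATUS = 0x10          # bit 0 = done (assumed)

    def write_reg(regs, offset, value):
        regs[offset:offset + 4] = struct.pack("<I", value)

    def read_reg(regs, offset):
        return struct.unpack("<I", regs[offset:offset + 4])[0]

    def start_frame_transfer(src_addr, dst_addr, nbytes):
        fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
        try:
            regs = mmap.mmap(fd, 4096, offset=DMAC_BASE)
            write_reg(regs, REG_SRC, src_addr)
            write_reg(regs, REG_DST, dst_addr)
            write_reg(regs, REG_LEN, nbytes)
            write_reg(regs, REG_CTRL, 0b11)              # start + enable interrupt
            while not (read_reg(regs, REG_STATUS) & 1):  # poll the done bit
                pass
        finally:
            os.close(fd)

The utilization benefit the paper targets is visible even in this toy sketch: the CPU writes only a handful of descriptor registers, and the bulk frame data then moves without further CPU involvement.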

2020
Vol 27 (5)
pp. 1297-1306
Author(s):
Raphael Ponsard,
Nicolas Janvier,
Jerome Kieffer,
Dominique Houzet,
Vincent Fristot

The continual evolution of photon sources and high-performance detectors drives cutting-edge experiments that can produce very high throughput data streams and generate large data volumes that are challenging to manage and store. In these cases, efficient data transfer and processing architectures that allow online image correction, data reduction or compression become fundamental. This work investigates different technical options and methods for data placement from the detector head to the processing computing infrastructure, taking into account the particularities of modern modular high-performance detectors. In order to compare realistic figures, the future ESRF beamline dedicated to macromolecular X-ray crystallography, EBSL8, is taken as an example; it will use a PSI JUNGFRAU 4M detector generating up to 16 GB of data per second and operating continuously for several minutes. Although such an experiment seems possible at the target speed with the 100 Gb s⁻¹ network cards that are currently available, the simulations highlight some potential bottlenecks when using a traditional software stack. An evaluation is presented of solutions that implement remote direct memory access (RDMA) over Converged Ethernet (RoCE) techniques. A synchronization mechanism is proposed between an RDMA network interface card (RNIC) and a graphics processing unit (GPU) accelerator in charge of the online data processing. The placement of the detector images onto the GPU is made to overlap with the computation carried out, potentially hiding the transfer latencies. As a proof of concept, a detector simulator and a backend GPU receiver with a rejection and compression algorithm suitable for a synchrotron serial crystallography (SSX) experiment are developed. It is concluded that the available transfer throughput from the RNIC to the GPU accelerator is at present the major bottleneck in online processing for SSX experiments.
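The overlap described above (placing image i+1 on the accelerator while image i is being processed) can be sketched independently of the RDMA and GPU specifics. In the Python sketch below, a single prefetch worker stands in for the RNIC and a simple reduction stands in for the GPU rejection/compression kernel; the frame size, frame count, and both stand-in functions are illustrative assumptions, not the authors' pipeline.

    # Conceptual double-buffering sketch: overlap "transfer" of frame i+1
    # with "processing" of frame i, hiding transfer latency behind compute.
    from concurrent.futures import ThreadPoolExecutor
    import numpy as np

    FRAME_SHAPE = (2048, 2048)   # illustrative frame size
    N_FRAMES = 32

    def receive_frame(i):
        # Stand-in for an RDMA write landing in a pinned receive buffer.
        return np.full(FRAME_SHAPE, i, dtype=np.uint16)

    def process_frame(frame):
        # Stand-in for the GPU rejection/compression step; here a reduction.
        return int(frame.sum())

    def run_pipeline():
        results = []
        with ThreadPoolExecutor(max_workers=1) as io_pool:
            pending = io_pool.submit(receive_frame, 0)
            for i in range(N_FRAMES):
                frame = pending.result()                             # wait for frame i
                if i + 1 < N_FRAMES:
                    pending = io_pool.submit(receive_frame, i + 1)   # prefetch frame i+1
                results.append(process_frame(frame))                 # compute overlaps the prefetch
            return results

    if __name__ == "__main__":
        print(len(run_pipeline()), "frames processed")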


2019
Vol 9 (21)
pp. 4541
Author(s):
Syed Asif Raza Shah,
Seo-Young Noh

Large scientific experimental facilities are currently generating tremendous amounts of data, and significant growth in scientific data analysis has been observed across research centers in recent years. These facilities produce unprecedented data volumes and face new challenges in transferring large data sets across continents. Data transfer now plays an important role in new scientific discoveries, and the performance of distributed scientific environments depends heavily on high-performance, adaptive, and robust network service infrastructures. Supporting large-scale data transfer for extreme-scale distributed science requires high-performance, scalable, end-to-end, programmable networks that scientific applications can use efficiently. We developed the AmoebaNet solution to address the problem of dynamically programmable networks for bulk data transfer in extreme-scale distributed science environments. A major goal of the AmoebaNet project is to apply software-defined networking (SDN) technology to provide an “application-aware” network that facilitates bulk data transfer. We have prototyped AmoebaNet's SDN-enabled network service, which allows applications to program the network dynamically at run time for bulk data transfers. In this paper, we evaluate AmoebaNet with real-world test cases and show how it can use networks efficiently and dynamically for bulk data transfer in large-scale scientific environments.
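AmoebaNet's run-time programmability is described above only at a high level. The sketch below shows the general shape of an application asking an SDN controller's northbound REST interface for a dedicated path before launching a bulk transfer; the controller address, URL scheme, and JSON fields are hypothetical and do not reflect AmoebaNet's actual API.

    # Hypothetical northbound REST calls an application might make to reserve
    # and release a path for a bulk transfer (URL and fields are illustrative).
    import requests

    CONTROLLER = "http://sdn-controller.example.org:8181"  # assumed controller address

    def request_bulk_transfer_path(src_host, dst_host, bandwidth_mbps, duration_s):
        payload = {
            "source": src_host,
            "destination": dst_host,
            "bandwidth_mbps": bandwidth_mbps,
            "duration_s": duration_s,
        }
        resp = requests.post(f"{CONTROLLER}/paths", json=payload, timeout=10)
        resp.raise_for_status()
        return resp.json()["path_id"]   # identifier used later to tear the path down

    def release_path(path_id):
        requests.delete(f"{CONTROLLER}/paths/{path_id}", timeout=10).raise_for_status()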


2017
Author(s):
Kyle Chard,
Eli Dart,
Ian Foster,
David Shifflett,
Steven Tuecke,
...

We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. We capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based data portal to achieve orders-of-magnitude increases in data transfer performance, support new deployment architectures that decouple control logic from data storage, and reduce development and operations costs. We introduce the design pattern; explain how it leverages high-performance Science DMZs and cloud-based data management services; review representative examples at research laboratories and universities, including both experimental facilities and supercomputer sites; describe how to leverage Python APIs for authentication, authorization, data transfer, and data sharing; and use coding examples to demonstrate how these APIs can be used to implement a range of research data portal capabilities. Sample code at a companion web site, https://docs.globus.org/mrdp, provides application skeletons that readers can adapt to realize their own research data portals.
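Because the pattern relies on Globus cloud services and their Python APIs, a minimal transfer-submission sketch using the globus_sdk package is shown below; the endpoint UUIDs, paths, and token handling are placeholders, and method signatures may differ between SDK versions (the companion site above documents the full flow, including authentication).

    # Minimal transfer submission with the Globus Python SDK (globus_sdk).
    # Endpoint UUIDs, paths, and the access token are placeholders.
    import globus_sdk

    TRANSFER_TOKEN = "..."   # obtained through the SDK's OAuth2 flows
    SRC_ENDPOINT = "ddb59aef-6d04-11e5-ba46-22000b92c6ec"   # placeholder UUID
    DST_ENDPOINT = "ddb59af0-6d04-11e5-ba46-22000b92c6ec"   # placeholder UUID

    tc = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(TRANSFER_TOKEN))

    # Describe the transfer: recursively copy one directory between endpoints,
    # verifying integrity with checksums.
    tdata = globus_sdk.TransferData(tc, SRC_ENDPOINT, DST_ENDPOINT,
                                    label="portal download", sync_level="checksum")
    tdata.add_item("/datasets/run42/", "/~/run42/", recursive=True)

    task = tc.submit_transfer(tdata)
    print("submitted transfer task:", task["task_id"])

The control logic (a few REST-backed SDK calls) is all the portal itself has to run; the data flows directly between the two endpoints, which is what decouples control from data storage in the pattern.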


2006
Author(s):
R. Cirami,
P. Di Marcantonio,
G. Chiozzi,
B. Jeram

2019
Vol 28 (05)
pp. 1950074
Author(s):
Wei Chen,
Songping Yu,
Zhiying Wang

The quick advances of Cloud computing and the advent of Fog computing impose increasingly critical demands for low-latency computing and data transfer on the underlying distributed computing infrastructure. Remote direct memory access (RDMA) technology has been widely applied for its low latency of remote data access. However, RDMA raises a host of challenges in accelerating in-memory key–value stores; for example, direct remote memory writes make the remote system more vulnerable. This study presents an RDMA-based in-memory key–value system, named Craftscached, which enables: (1) buffering remote memory writes into a communication cache memory to eliminate direct remote memory writes to the data memory area; (2) dividing the communication cache memory into RDMA-writable and RDMA-readable memory zones to reduce the possibility of data corruption due to stray memory writes, and caching data in an RDMA-readable memory zone to improve remote memory read performance; and (3) adopting remote out-of-place direct memory writes to achieve high performance for both remote reads and writes. Experimental results in comparison with Memcached indicate that Craftscached provides far better performance: (1) for read-intensive workloads, data access in Craftscached is about 7–43× and 18–72.4% better than in TCP/IP-based and RDMA-based Memcached, respectively; and (2) memory utilization for small objects is more efficient, with only about 3.8% memory compaction overhead.
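The zoning scheme, in which remote writes are confined to a communication cache and only the server copies them out of place into the data area, can be modelled without RDMA hardware. The Python sketch below is a conceptual model of that separation, with a writable staging zone, an authoritative data area, and a read-only zone for serving remote reads; it is a toy illustration, not the Craftscached implementation.

    # Conceptual model of the zoning scheme: remote writes land only in a
    # staging zone; the server applies them out-of-place to the data area;
    # values are published to a read-only zone for one-sided remote reads.
    class ZonedStore:
        def __init__(self):
            self.write_zone = {}   # models the RDMA-writable communication cache
            self.data_area = {}    # authoritative store, never remotely writable
            self.read_zone = {}    # models the RDMA-readable cache

        def remote_write(self, key, value):
            # A remote client may only deposit into the staging zone.
            self.write_zone[key] = bytes(value)

        def apply_writes(self):
            # Server-side, out-of-place application into the data area.
            for key, value in self.write_zone.items():
                self.data_area[key] = value
                self.read_zone[key] = value   # publish for remote reads
            self.write_zone.clear()

        def remote_read(self, key):
            # Remote reads are served only from the read-only zone.
            return self.read_zone.get(key)

    store = ZonedStore()
    store.remote_write("k1", b"value-1")
    store.apply_writes()
    assert store.remote_read("k1") == b"value-1"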

