NVMM-Oriented Hierarchical Persistent Client Caching for Lustre

In high-performance computing (HPC), data and metadata are stored on special server nodes and client applications access the servers’ data and metadata through a network, which induces network latencies and resource contention. These server nodes are typically equipped with (slow) magnetic disks, while the client nodes store temporary data on fast SSDs or even on non-volatile main memory (NVMM). Therefore, the full potential of parallel file systems can only be reached if fast client side storage devices are included into the overall storage architecture. In this article, we propose an NVMM-based hierarchical persistent client cache for the Lustre file system (NVMM-LPCC for short). NVMM-LPCC implements two caching modes: a read and write mode (RW-NVMM-LPCC for short) and a read only mode (RO-NVMM-LPCC for short). NVMM-LPCC integrates with the Lustre Hierarchical Storage Management (HSM) solution and the Lustre layout lock mechanism to provide consistent persistent caching services for I/O applications running on client nodes, meanwhile maintaining a global unified namespace of the entire Lustre file system. The evaluation results presented in this article show that NVMM-LPCC can increase the average read throughput by up to 35.80 times and the average write throughput by up to 9.83 times compared with the native Lustre system, while providing excellent scalability.

Download Full-text

Exploring Efficient Architectures on Remote In-Memory NVM over RDMA

ACM Transactions on Embedded Computing Systems ◽

10.1145/3477004 ◽

2021 ◽

Vol 20 (5s) ◽

pp. 1-20

Author(s):

Qingfeng Zhuge ◽

Hao Zhang ◽

Edwin Hsing-Mean Sha ◽

Rui Xu ◽

Jun Liu ◽

...

Keyword(s):

High Performance ◽

File System ◽

File Systems ◽

Data Access ◽

Main Memory ◽

Memory Modules ◽

Significant Performance ◽

Architectural Structures ◽

Memory Accesses ◽

Careful Design

Efficiently accessing remote file data remains a challenging problem for data processing systems. Development of technologies in non-volatile dual in-line memory modules (NVDIMMs), in-memory file systems, and RDMA networks provide new opportunities towards solving the problem of remote data access. A general understanding about NVDIMMs, such as Intel Optane DC Persistent Memory (DCPM), is that they expand main memory capacity with a cost of multiple times lower performance than DRAM. With an in-depth exploration presented in this paper, however, we show an interesting finding that the potential of NVDIMMs for high-performance, remote in-memory accesses can be revealed through careful design. We explore multiple architectural structures for accessing remote NVDIMMs in a real system using Optane DCPM, and compare the performance of various structures. Experiments are conducted to show significant performance gaps among different ways of using NVDIMMs as memory address space accessible through RDMA interface. Furthermore, we design and implement a prototype of user-level, in-memory file system, RIMFS, in the device DAX mode on Optane DCPM. By comparing against the DAX-supported Linux file system, Ext4-DAX, we show that the performance of remote reads on RIMFS over RDMA is 11.44 higher than that on a remote Ext4-DAX on average. The experimental results also show that the performance of remote accesses on RIMFS is maintained on a heavily loaded data server with CPU utilization as high as 90%, while the performance of remote reads on Ext4-DAX is significantly reduced by 49.3%, and the performance of local reads on Ext4-DAX is even more significantly reduced by 90.1%. The performance comparisons of writes exhibit the same trends.

Download Full-text

InK: In-Kernel Key-Value Storage with Persistent Memory

Electronics ◽

10.3390/electronics9111913 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1913

Author(s):

Minjong Ha ◽

Sang-Hoon Kim

Keyword(s):

High Performance ◽

File Systems ◽

Cloud Service ◽

Main Memory ◽

Storage Device ◽

Storage Devices ◽

Persistent Memory ◽

Non Volatile Memory ◽

Long Time ◽

Block Based

Block-based storage devices exhibit different characteristics from main memory, and applications and systems have been optimized for a long time considering the characteristics in mind. However, emerging non-volatile memory technologies are about to change the situation. Persistent Memory (PM) provides a huge, persistent, and byte-addressable address space to the system, thereby enabling new opportunities for systems software. However, existing applications are usually apt to indirectly utilize PM as a storage device on top of file systems. This makes applications and file systems perform unnecessary operations and amplify I/O traffic, thereby under-utilizing the high performance of PM. In this paper, we make the case for an in-Kernel key-value storage service optimized for PM, called InK. While providing the persistence of data at a high performance, InK considers the characteristics of PM to guarantee the crash consistency. To this end, InK indexes key-value pairs with B+ tree, which is more efficient on PM. We implemented InK based on the Linux kernel and evaluated its performance with Yahoo Cloud Service Benchmark (YCSB) and RocksDB. Evaluation results confirms that InK has advantages over LSM-tree-based key-value store systems in terms of throughput and tail latency.

Download Full-text

Octopus + : An RDMA-Enabled Distributed Persistent Memory File System

ACM Transactions on Storage ◽

10.1145/3448418 ◽

2021 ◽

Vol 17 (3) ◽

pp. 1-25

Author(s):

Bohong Zhu ◽

Youmin Chen ◽

Qing Wang ◽

Youyou Lu ◽

Jiwu Shu

Keyword(s):

High Speed ◽

High Performance ◽

File System ◽

Direct Memory Access ◽

File Systems ◽

Distributed File Systems ◽

Persistent Memory ◽

Memory Modules ◽

Non Volatile Memory ◽

Volatile Memory

Non-volatile memory and remote direct memory access (RDMA) provide extremely high performance in storage and network hardware. However, existing distributed file systems strictly isolate file system and network layers, and the heavy layered software designs leave high-speed hardware under-exploited. In this article, we propose an RDMA-enabled distributed persistent memory file system, Octopus + , to redesign file system internal mechanisms by closely coupling non-volatile memory and RDMA features. For data operations, Octopus + directly accesses a shared persistent memory pool to reduce memory copying overhead, and actively fetches and pushes data all in clients to rebalance the load between the server and network. For metadata operations, Octopus + introduces self-identified remote procedure calls for immediate notification between file systems and networking, and an efficient distributed transaction mechanism for consistency. Octopus + is enabled with replication feature to provide better availability. Evaluations on Intel Optane DC Persistent Memory Modules show that Octopus + achieves nearly the raw bandwidth for large I/Os and orders of magnitude better performance than existing distributed file systems.

Download Full-text

Scalable Distributed Metadata Server Based on Nonblocking Transactions

JUCS - Journal of Universal Computer Science ◽

10.3897/jucs.2020.006 ◽

2020 ◽

Vol 26 (1) ◽

pp. 89-106

Author(s):

Kohei Hiraga ◽

Osamu Tatebe ◽

Hideyuki Kawashima

Keyword(s):

High Performance ◽

File Systems ◽

Software Transactional Memory ◽

Server Side ◽

Multiple Servers ◽

Distributed Transaction ◽

Lock Mode ◽

Dynamic Software ◽

Metadata Server ◽

Performance Computing

Metadata performance scalability is critically important in high-performance computing when accessing many small files from millions of clients. This paper proposes a design of a scalable distributed metadata server, PPMDS, for parallel file systems using multiple key-value servers. In PPMDS, hierarchical namespace of a file system is efficiently managed by multiple servers. Multiple entries can be atomically updated using a nonblocking distributed transaction based on an algorithm of dynamic software transactional memory. This paper also proposes optimizations to further improve the metadata performance by introducing a server-side transaction processing, multiple readers, and a shared lock mode, which reduce the number of remote procedure calls and prevent unnecessary blocking. Performance evaluation shows the scalable performance up to 3 servers, and achieves 62,000 operations per second, which is 2.58x performance improvement compared to a single metadata performance.

Download Full-text

On the benefits of a workflow-aware file system in high-performance computing systems

Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05) ◽

10.1109/hpcasia.2005.58 ◽

2005 ◽

Author(s):

Yang Wang ◽

P. Lu

Keyword(s):

High Performance Computing ◽

High Performance ◽

File System ◽

Computing Systems ◽

Performance Computing

Download Full-text

High Performance Storage for Big Data Analytics and Visualization

Advances in Data Mining and Database Management - Handbook of Research on Big Data Storage and Visualization Techniques ◽

10.4018/978-1-5225-3142-5.ch010 ◽

2018 ◽

pp. 254-275

Author(s):

Armando Fandango ◽

William Rivera

Keyword(s):

Big Data ◽

High Speed ◽

High Performance ◽

File System ◽

Predictive Analytics ◽

Big Data Analytics ◽

File Systems ◽

Distributed Applications ◽

System Level ◽

File Formats

Scientific Big Data being gathered at exascale needs to be stored, retrieved and manipulated. The storage stack for scientific Big Data includes a file system at the system level for physical organization of the data, and a file format and input/output (I/O) system at the application level for logical organization of the data; both of them of high-performance variety for exascale. The high-performance file system is designed with concurrent access, high-speed transmission and fault tolerance characteristics. High-performance file formats and I/O are designed to allow parallel and distributed applications with easy and fast access to Big Data. These specialized file formats make it easier to store and access Big Data for scientific visualization and predictive analytics. This chapter provides a brief review of the characteristics of high-performance file systems such as Lustre and GPFS, and high-performance file formats such as HDF5, NetCDF, MPI-IO, and HDFS.

Download Full-text

LEARNING SOLUTIONS WITH CLOUD TECHNOLOGIES

Jurnal Teknologi ◽

10.11113/jt.v78.8921 ◽

2016 ◽

Vol 78 (6-3) ◽

Author(s):

Tatiana Zudilova ◽

Svetlana Odinochkina ◽

Victor Prygun

Keyword(s):

High Performance ◽

Cloud Model ◽

Private Cloud ◽

New Approach ◽

Storage Devices ◽

Training Courses ◽

Ict Training ◽

Cloud Technologies ◽

Programming Courses ◽

Performance Computing

This paper presents a new approach to the organization of ICT training courses on the basis of the designed and developed a private training cloud prototype. The developed prototype was built on the Microsoft System Center 2012 resources, which allowed consolidating high-performance computing tools, combined different classes of storage devices and provided these resources on demand. We describe the implementation of SaaS private cloud model for ICT user courses and PaaS model - for ICT programming courses.

Download Full-text

Ad Hoc File Systems for High-Performance Computing

Journal of Computer Science and Technology ◽

10.1007/s11390-020-9801-1 ◽

2020 ◽

Vol 35 (1) ◽

pp. 4-26 ◽

Cited By ~ 2

Author(s):

André Brinkmann ◽

Kathryn Mohror ◽

Weikuan Yu ◽

Philip Carns ◽

Toni Cortes ◽

...

Keyword(s):

High Performance Computing ◽

High Performance ◽

Ad Hoc ◽

File Systems ◽

Performance Computing

Download Full-text

Performance Analysis of Lustre File System using High Performance Storage Devices

KIISE Transactions on Computing Practices ◽

10.5626/ktcp.2016.22.4.163 ◽

2016 ◽

Vol 22 (4) ◽

pp. 163-169 ◽

Cited By ~ 2

Author(s):

Jaehwan Lee ◽

Donghun Koo ◽

Kyungmin Park ◽

Jiksoo Kim ◽

Soonwook Hwang

Keyword(s):

Performance Analysis ◽

High Performance ◽

File System ◽

Storage Devices

Download Full-text

Special Issue on Automatic Application Tuning for HPC Architectures

Scientific Programming ◽

10.1155/2014/208480 ◽

2014 ◽

Vol 22 (4) ◽

pp. 259-260 ◽

Cited By ~ 1

Author(s):

Siegfried Benkner ◽

Franz Franchetti ◽

Hans Michael Gerndt ◽

Jeffrey K. Hollingsworth

Keyword(s):

High Performance ◽

Research Field ◽

Performance Tuning ◽

Full Potential ◽

Research Groups ◽

Special Issue ◽

The World ◽

And Performance ◽

Application Tuning ◽

Performance Computing

High Performance Computing architectures have become incredibly complex and exploiting their full potential is becoming more and more challenging. As a consequence, automatic performance tuning (autotuning) of HPC applications is of growing interest and many research groups around the world are currently involved. Autotuning is still a rapidly evolving research field with many different approaches being taken. This special issue features selected papers presented at the Dagstuhl seminar on “Automatic Application Tuning for HPC Architectures” in October 2013, which brought together researchers from the areas of autotuning and performance analysis in order to exchange ideas and steer future collaborations.

Download Full-text