parallel file system
Recently Published Documents


TOTAL DOCUMENTS

138
(FIVE YEARS 4)

H-INDEX

15
(FIVE YEARS 0)

Author(s):  
Shu-Mei Tseng ◽  
Bogdan Nicolae ◽  
Franck Cappello ◽  
Aparna Chandramowlishwaran

With the increasing complexity of HPC workflows, data management services need to perform expensive I/O operations asynchronously in the background, aiming to overlap the I/O with the application runtime. However, this can cause interference due to competition for resources such as CPU cycles and memory/network bandwidth. The advent of multi-core architectures has exacerbated this problem: many I/O operations are issued concurrently and therefore compete not only with the application but also among themselves. Furthermore, the interference patterns can change dynamically in response to variations in application behavior and in the I/O subsystem (e.g., multiple users sharing a parallel file system). Without a thorough understanding of these effects, I/O operations may perform suboptimally, potentially even worse than in the blocking case. To fill this gap, this paper investigates the causes and consequences of interference due to asynchronous I/O on HPC systems. Specifically, we focus on multi-core CPUs and memory bandwidth, isolating the interference due to each resource. We then perform an in-depth study to explain the interplay and contention in a variety of resource-sharing scenarios, such as varying the priority and number of background I/O threads, and compare different I/O strategies (sendfile, read/write, mmap/write), underlining their trade-offs. The insights from this study are important both to enable guided optimization of existing background I/O and to open new opportunities for designing advanced asynchronous I/O strategies.
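The three I/O strategies the abstract compares can be sketched as simple file-copy routines. This is a minimal illustration in Python using the standard POSIX-style wrappers (`os.sendfile`, buffered `read`/`write`, and `mmap`); it is not the authors' benchmark code, only a sketch of what each strategy does with the data path:

```python
import mmap
import os

CHUNK = 1 << 20  # 1 MiB user-space buffer for the read/write strategy

def copy_read_write(src, dst):
    """read/write: each chunk is copied into a user-space buffer, then out."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(CHUNK)
            if not buf:
                break
            fout.write(buf)

def copy_mmap_write(src, dst):
    """mmap/write: the source file is mapped into memory and written out,
    so reads happen via page faults instead of explicit read() calls."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        with mmap.mmap(fin.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            fout.write(mm)

def copy_sendfile(src, dst):
    """sendfile: the kernel moves data between the two descriptors
    without staging it through a user-space buffer (Linux)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        size = os.fstat(fin.fileno()).st_size
        offset = 0
        while offset < size:
            sent = os.sendfile(fout.fileno(), fin.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent
```

Which strategy interferes least with the application depends on exactly the resources the paper studies: read/write burns CPU and memory bandwidth on the extra copy, mmap shifts cost to page-fault handling, and sendfile avoids the user-space copy entirely.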


Author(s):  
Guillaume Aupy ◽  
Brice Goglin ◽  
Valentin Honoré ◽  
Bruno Raffin

With the goal of performing exascale computing, input/output (I/O) management becomes more and more critical to maintaining system performance. While the computing capacity of machines keeps growing, the I/O capabilities of systems do not increase as fast. We are able to generate more data but unable to manage them efficiently due to the variability of I/O performance. Limiting requests to the parallel file system (PFS) therefore becomes necessary. To address this issue, new strategies such as online in situ analysis are being developed. The idea is to overcome the limitations of basic postmortem data analysis, where the data have to be stored on the PFS first and processed later. Several software solutions allow users to dedicate nodes specifically to data analysis and to distribute the computation tasks over different sets of nodes. Thus far, they rely on manual partitioning of resources and allocation of tasks (simulation, analysis) by the user. In this work, we propose a memory-constrained model for in situ analysis. We use this model to derive scheduling policies that determine both the number of resources that should be dedicated to analysis functions and an efficient schedule for these functions. We evaluate these policies and show the importance of considering memory constraints in the model. Finally, we discuss the challenges that remain to be addressed to build automatic tools for in situ analytics.
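The core idea of memory-constrained in situ scheduling can be illustrated with a toy policy. The sketch below is a simple greedy heuristic of my own, not one of the paper's scheduling algorithms: each analysis function has a memory footprint and an estimated benefit (e.g., avoided PFS traffic), and functions are admitted in situ under a node-memory budget, preferring high benefit per GiB:

```python
from dataclasses import dataclass

@dataclass
class Analysis:
    name: str
    memory: float   # GiB the function needs while resident
    benefit: float  # estimated value of running it in situ

def schedule_in_situ(tasks, budget):
    """Greedy sketch: admit analysis functions under a memory budget,
    ordered by benefit per GiB. Rejected functions would fall back to
    postmortem processing on the PFS."""
    chosen, used = [], 0.0
    for t in sorted(tasks, key=lambda t: t.benefit / t.memory, reverse=True):
        if used + t.memory <= budget:
            chosen.append(t)
            used += t.memory
    return chosen
```

Even this toy version shows why the memory constraint matters: with an unlimited budget every analysis runs in situ, while a tight budget forces a choice between many small functions and one large one, which is exactly the trade-off the proposed model captures formally.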


Author(s):  
Glenn K. Lockwood ◽  
Shane Snyder ◽  
Teng Wang ◽  
Suren Byna ◽  
Philip Carns ◽  
...  
