An Approach for Evaluating and Mitigating Intra-Application I/O Performance Variability Over Parallel File Systems

Author(s):  
Eduardo Inacio ◽  
Mario Antonio Dantas

To meet the ever increasing capacity and performance requirements of emerging data-intensive applications, highly distributed and multilayered back-end storage systems have been employed in large-scale high performance computing (HPC) environments. A main component of these storage infrastructures is the parallel file system (PFS), a file system especially designed to absorb bulk data transfers from applications with thousands of concurrent processes. Load distribution across PFS data servers constitutes a major source of intra-application input/output (I/O) performance variability. Although mitigating variability is desirable, as it is known to harm application-perceived performance, understanding and dealing with I/O performance variability in such complex environments remains a challenging task. In this research, a differentiated approach for evaluating and mitigating intra-application I/O performance variability over PFSs is proposed. More specifically, from the evaluation perspective, a comprehensive approach combining complementary methods is proposed. An analytical model, named DTSMaxLoad, provides estimates for the maximum load on a PFS data server. To complement DTSMaxLoad by modeling conditions and mechanisms that are hard to represent analytically, the Parallel I/O and Storage System (PIOSS) simulation model is proposed. Finally, for experimental evaluation in real environments, a flexible and distributed I/O performance evaluation tool, named IOR-Extended (IORE), is proposed. Furthermore, a high-level file distribution approach for PFSs, called N-N Round-Robin (N2R2), is proposed to mitigate I/O performance variability for distributed applications in which each process accesses an individual, independent file. An extensive experimental effort, including measurements in real environments, was conducted to evaluate each of the proposed approaches. In summary, this evaluation indicated that both the DTSMaxLoad and PIOSS models represent load distribution behavior on PFSs with high fidelity. Moreover, results demonstrated that N2R2 successfully reduced intra-application I/O performance variability in 270 distinct experimental scenarios, which ultimately translated into overall application I/O performance improvements.
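
The abstract does not spell out the N2R2 policy itself, but the core idea behind round-robin file placement for N-processes/N-files workloads can be illustrated with a minimal sketch (function names and the single-server-per-file simplification below are hypothetical, not the paper's implementation):

```python
def round_robin_placement(num_files, num_servers):
    """Assign each per-process file to a single PFS data server in
    round-robin order; per-server file counts then differ by at most
    one, which caps the maximum load on any single data server."""
    return {f: f % num_servers for f in range(num_files)}

def server_loads(placement, num_servers):
    """Count how many files land on each data server."""
    loads = [0] * num_servers
    for server in placement.values():
        loads[server] += 1
    return loads

# Example: 10 per-process files over 4 data servers -> loads [3, 3, 2, 2],
# i.e. the maximum load is minimized for this file count.
placement = round_robin_placement(10, 4)
print(server_loads(placement, 4))
```

An uneven placement (say, hashing file names) could put many files on one server; the round-robin bound is what keeps the worst-case server load, and hence the slowest process, close to the average.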

2012 ◽  
Vol 241-244 ◽  
pp. 1556-1561
Author(s):  
Qi Meng Wu ◽  
Ke Xie ◽  
Ming Fa Zhu ◽  
Li Min Xiao ◽  
Li Ruan

Parallel file systems deploy multiple metadata servers to distribute the heavy metadata workload from clients. With an increasing number of metadata servers, metadata-intensive operations face coordination problems among the servers that compromise the performance gain. A file system simulator is therefore very helpful for trying out optimization ideas to solve these problems. In this paper, we propose DMFSsim to simulate metadata-intensive operations on large-scale distributed metadata file systems. DMFSsim can flexibly replay traces of multiple metadata operations, supports several commonly used metadata distribution algorithms, and simulates the file system tree hierarchy and the underlying disk block management mechanisms of real systems. Extensive simulations show that DMFSsim is capable of demonstrating the performance of metadata-intensive operations in distributed metadata file systems.
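
Two widely known classes of metadata distribution algorithms that a simulator like this could replay traces against are hash-based mapping and subtree partitioning. A minimal sketch of each (hypothetical names; not DMFSsim's actual code):

```python
import hashlib

def hash_mds(path, num_servers):
    """Hash-based distribution: the full path picks the metadata
    server, spreading load evenly but breaking directory locality."""
    return int(hashlib.md5(path.encode()).hexdigest(), 16) % num_servers

def subtree_mds(path, num_servers):
    """Subtree partitioning: the top-level directory picks the server,
    preserving locality but risking hot spots on popular subtrees."""
    parts = path.strip("/").split("/")
    return int(hashlib.md5(parts[0].encode()).hexdigest(), 16) % num_servers

print(hash_mds("/home/user/data/file.txt", 4))     # varies per file
print(subtree_mds("/home/user/data/file.txt", 4))  # same MDS for all of /home
```

The trade-off between the two is exactly the kind of coordination-versus-balance question a metadata simulator helps explore before touching a production system.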


Author(s):  
Anthony Kougkas ◽  
Hassan Eslami ◽  
Xian-He Sun ◽  
Rajeev Thakur ◽  
William Gropp

Key–value stores are widely used as the storage system for large-scale internet services and cloud storage systems. However, they are rarely used in HPC systems, where parallel file systems are the dominant storage solution. In this study, we examine the architectural differences and performance characteristics of parallel file systems and key–value stores. We propose using key–value stores to optimize overall input/output (I/O) performance, especially for workloads that parallel file systems cannot handle well, such as those with intensive data synchronization or heavy metadata operations. We conducted experiments with several synthetic benchmarks, an I/O benchmark, and a real application. We modeled the performance of the two systems using data collected from our experiments, and we provide a predictive method to identify which system offers better I/O performance for a given workload. The results show that we can optimize I/O performance in HPC systems by utilizing key–value stores.
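
The abstract does not specify the form of the predictive method; one plausible shape for such a predictor is a per-system linear cost model with coefficients fitted from benchmark measurements. The sketch below is purely illustrative (all coefficients are placeholders, not the paper's measured values):

```python
def io_time(num_ops, total_bytes, per_op_latency, bandwidth):
    """Linear cost model: a per-operation (metadata/sync) latency term
    plus a bandwidth-limited data transfer term."""
    return num_ops * per_op_latency + total_bytes / bandwidth

def pick_storage(num_ops, total_bytes, pfs_coeffs, kv_coeffs):
    """Return whichever system the fitted model predicts to be faster."""
    t_pfs = io_time(num_ops, total_bytes, *pfs_coeffs)
    t_kv = io_time(num_ops, total_bytes, *kv_coeffs)
    return ("key-value store", t_kv) if t_kv < t_pfs else ("parallel file system", t_pfs)

# Placeholder coefficients: (seconds per op, bytes per second).
# A metadata-heavy workload (many small ops) favors the key-value store.
print(pick_storage(1_000_000, 10 * 2**30,
                   pfs_coeffs=(2e-3, 5e9),   # PFS: slower ops, higher bandwidth
                   kv_coeffs=(1e-4, 1e9)))   # KV: faster ops, lower bandwidth
```

With these placeholder numbers, the per-operation term dominates for small, frequent operations, reproducing the abstract's intuition that synchronization- and metadata-heavy workloads are better served by a key–value store.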


Author(s):  
Armando Fandango ◽  
William Rivera

Scientific Big Data gathered at exascale needs to be stored, retrieved, and manipulated. The storage stack for scientific Big Data includes a file system at the system level for the physical organization of data, and a file format and input/output (I/O) system at the application level for its logical organization; both must be of the high-performance variety for exascale. A high-performance file system is designed for concurrent access, high-speed transmission, and fault tolerance. High-performance file formats and I/O systems are designed to give parallel and distributed applications easy and fast access to Big Data. These specialized file formats make it easier to store and access Big Data for scientific visualization and predictive analytics. This chapter provides a brief review of the characteristics of high-performance file systems such as Lustre and GPFS, and of high-performance file formats and I/O systems such as HDF5, NetCDF, MPI-IO, and HDFS.
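
As a concrete taste of one of these application-level formats, here is a minimal HDF5 sketch using the h5py Python bindings (the file and dataset names are illustrative) that writes and then partially reads a chunked, compressed dataset:

```python
import numpy as np
import h5py  # Python bindings for the HDF5 format

data = np.random.rand(1000, 1000)

# Write a chunked, gzip-compressed dataset with a descriptive attribute.
with h5py.File("simulation.h5", "w") as f:
    dset = f.create_dataset("temperature", data=data,
                            chunks=(100, 100), compression="gzip")
    dset.attrs["units"] = "kelvin"

# Read back only a sub-block; HDF5 fetches just the chunks it needs
# rather than the whole array, which is what makes the format attractive
# for visualization and analytics over very large datasets.
with h5py.File("simulation.h5", "r") as f:
    block = f["temperature"][:100, :100]
    print(block.shape, f["temperature"].attrs["units"])
```

Chunking and self-describing attributes like these are the features that distinguish scientific formats such as HDF5 and NetCDF from raw binary files.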


2014 ◽  
Vol 573 ◽  
pp. 556-559
Author(s):  
A. Shenbaga Bharatha Priya ◽  
J. Ganesh ◽  
Mareeswari M. Devi

Infrastructure-as-a-Service (IaaS) provides the underlying environment for any type of cloud. In a distributed file system (DFS), nodes simultaneously serve computing and storage functions; that is, they perform parallel data processing and storage in the cloud. Here, a file is treated as a unit of data, or load. Each file is partitioned into a number of file chunks (FC) allocated to distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. Files and nodes can be dynamically created, deleted, and added, which results in load imbalance in a distributed file system; that is, the file chunks are not distributed as uniformly as possible among the chunk servers (CS). Emerging distributed file systems in production strongly depend on a central node for chunk reallocation, or on distributed nodes maintaining global knowledge of all chunks. This dependence is clearly inadequate in a large-scale, failure-prone environment: the central load balancer is placed under a workload that scales linearly with system size, so it may become a performance bottleneck and a single point of failure, while maintaining global knowledge wastes memory on distributed nodes. We therefore enhance the client-side module with a server-side module to create, delete, and update file chunks, manage the overall private cloud, and apply a dynamic load-balancing algorithm to perform auto-scaling in the private cloud. In this project, a fully distributed load rebalancing algorithm is presented to cope with the load imbalance problem.
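
The paper's algorithm is fully distributed; as a simplified, centralized illustration of the rebalancing goal itself (evening out chunk counts across chunk servers), consider this greedy sketch with hypothetical names:

```python
def rebalance(chunk_counts):
    """Greedy pass: repeatedly move one chunk from the most loaded to
    the least loaded chunk server until counts differ by at most one.
    Returns the list of (source, destination) chunk migrations."""
    moves = []
    while True:
        hi = max(chunk_counts, key=chunk_counts.get)
        lo = min(chunk_counts, key=chunk_counts.get)
        if chunk_counts[hi] - chunk_counts[lo] <= 1:
            return moves
        chunk_counts[hi] -= 1
        chunk_counts[lo] += 1
        moves.append((hi, lo))

# Example: server C is overloaded after node churn.
loads = {"A": 2, "B": 3, "C": 9, "D": 2}
print(rebalance(loads))  # migrations draining C toward A, B, D
print(loads)             # {'A': 4, 'B': 4, 'C': 4, 'D': 4}
```

A fully distributed version, as proposed in the paper, would have each node decide on such migrations from local or gossiped load information rather than from a global view like the dictionary above.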

