Performance gains in an ESM using parallel ad-hoc file systems

Author(s):  
Stefan Versick ◽  
Ole Kirner ◽  
Jörg Meyer ◽  
Holger Obermaier ◽  
Mehmet Soysal

Earth System Models (ESMs) have become much more demanding over the last years. Modelled processes have grown more complex, and more and more processes are considered in the models. In addition, model resolutions have increased to improve weather and climate forecasts. This requires faster high-performance computers (HPC) and better I/O performance.

Within our Pilot Lab Exascale Earth System Modelling (PL-EESM) we analyse the performance of the ESM EMAC using a standard Lustre file system for output and compare it to the performance using a parallel ad-hoc overlay file system. We will show the impact for two scenarios: one with today's standard amount of output and one with artificially heavy output simulating future ESMs.

An ad-hoc file system is a private parallel file system which is created on demand for an HPC job using the node-local storage devices, in our case solid-state disks (SSDs). It only exists during the runtime of the job. Therefore, output data have to be moved to a permanent file system before the job has finished. Quasi in-situ data analysis and post-processing can yield additional performance, as it may reduce the amount of data that has to be stored, saving disk space and time during the transfer of data to permanent storage. We will show first tests for quasi in-situ post-processing.
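
The staging workflow described here (write to a job-local ad-hoc file system, optionally reduce the data quasi in-situ, then copy the remainder to permanent storage before the job ends) can be illustrated with a short sketch. This is a minimal, hypothetical Python example; the mount points, file names and the time-averaging reduction are assumptions for illustration, not the authors' actual setup.

```python
import shutil
from pathlib import Path

import xarray as xr  # assumed post-processing library, not prescribed by the paper

ADHOC_FS = Path("/mnt/adhoc")              # hypothetical node-local ad-hoc file system mount
PERMANENT = Path("/lustre/project/run42")  # hypothetical permanent Lustre directory


def reduce_output(src: Path, dst: Path) -> None:
    """Quasi in-situ post-processing: keep only a time mean instead of the full output."""
    with xr.open_dataset(src) as ds:
        ds.mean(dim="time").to_netcdf(dst)


def stage_out() -> None:
    """Move (reduced) model output to permanent storage before the job ends."""
    PERMANENT.mkdir(parents=True, exist_ok=True)
    for src in sorted(ADHOC_FS.glob("output_*.nc")):
        reduced = ADHOC_FS / f"mean_{src.name}"
        reduce_output(src, reduced)       # shrink the data while it is still on fast SSDs
        shutil.copy2(reduced, PERMANENT)  # only the reduced file leaves the node


if __name__ == "__main__":
    stage_out()
```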

2021 ◽  
Author(s):  
Stefan Versick ◽  
Thomas Fischer ◽  
Ole Kirner ◽  
Tobias Meisel ◽  
Jörg Meyer

Earth System Models (ESMs) have become much more demanding over the last years. Modelled processes have grown more complex, and more and more processes are considered in the models. In addition, model resolutions have increased to improve the accuracy of predictions. This requires faster high-performance computers (HPC) and better I/O performance. One way to improve I/O performance is to use faster file systems. Last year we showed the impact of an ad-hoc file system on the performance of the ESM EMAC. An ad-hoc file system is a private parallel file system which is created on demand for an HPC job using the node-local storage devices, in our case solid-state disks (SSDs). It only exists during the runtime of the job. Therefore, output data have to be moved to a permanent file system before the job has finished. Performance improvements come from the use of SSDs in the case of small chunks of I/O or a high number of I/O operations per second. Another reason for a performance boost is that the running job has exclusive access to the file system. To get a better overview of the cases in which ESMs benefit from ad-hoc file systems, we repeated our performance tests with further ESMs using different I/O strategies. In total we have now analyzed EMAC (parallel netCDF), ICON2.5 (netCDF with asynchronous I/O), ICON2.6 (netCDF with the Climate Data Interface (CDI) library) and OpenGeoSys (parallel VTU).
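
As one concrete example of the I/O strategies listed above, parallel netCDF output lets every MPI rank write its share of a variable into the same file. The sketch below is a generic, minimal illustration using the netCDF4-python bindings built with parallel (MPI) support; the variable names, sizes and file name are hypothetical and do not come from EMAC or ICON.

```python
from mpi4py import MPI
import numpy as np
from netCDF4 import Dataset  # requires netCDF4-python built with parallel support

comm = MPI.COMM_WORLD
rank, nranks = comm.Get_rank(), comm.Get_size()

# Every rank opens the same file collectively on the (ad-hoc or Lustre) file system.
nc = Dataset("output.nc", "w", parallel=True, comm=comm, info=MPI.Info())

nlat = 96
nc.createDimension("lat", nlat * nranks)
var = nc.createVariable("temperature", "f4", ("lat",))

# Each rank writes its own, non-overlapping slice of the global array.
local = np.full(nlat, 273.15 + rank, dtype="f4")
var[rank * nlat:(rank + 1) * nlat] = local

nc.close()
```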


2016 ◽  
Vol 16 (14) ◽  
pp. 9435-9455 ◽  
Author(s):  
Matthew J. Alvarado ◽  
Chantelle R. Lonsdale ◽  
Helen L. Macintyre ◽  
Huisheng Bian ◽  
Mian Chin ◽  
...  

Abstract. Accurate modeling of the scattering and absorption of ultraviolet and visible radiation by aerosols is essential for accurate simulations of atmospheric chemistry and climate. Closure studies using in situ measurements of aerosol scattering and absorption can be used to evaluate and improve models of aerosol optical properties without interference from model errors in aerosol emissions, transport, chemistry, or deposition rates. Here we evaluate the ability of four externally mixed, fixed size distribution parameterizations used in global models to simulate submicron aerosol scattering and absorption at three wavelengths using in situ data gathered during the 2008 Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) campaign. The four models are the NASA Global Modeling Initiative (GMI) Combo model, GEOS-Chem v9-02, the baseline configuration of a version of GEOS-Chem with online radiative transfer calculations (called GC-RT), and the Optical Properties of Aerosol and Clouds (OPAC v3.1) package. We also use the ARCTAS data to perform the first evaluation of the ability of the Aerosol Simulation Program (ASP v2.1) to simulate submicron aerosol scattering and absorption when in situ data on the aerosol size distribution are used, and examine the impact of different mixing rules for black carbon (BC) on the results. We find that the GMI model tends to overestimate submicron scattering and absorption at shorter wavelengths by 10–23 %, and that GMI has smaller absolute mean biases for submicron absorption than OPAC v3.1, GEOS-Chem v9-02, or GC-RT. However, the changes to the density and refractive index of BC in GC-RT improve the simulation of submicron aerosol absorption at all wavelengths relative to GEOS-Chem v9-02. Adding a variable size distribution, as in ASP v2.1, improves model performance for scattering but not for absorption, likely due to the assumption in ASP v2.1 that BC is present at a constant mass fraction throughout the aerosol size distribution. Using a core-shell mixing rule in ASP overestimates aerosol absorption, especially for the fresh biomass burning aerosol measured in ARCTAS-B, suggesting the need for modeling the time-varying mixing states of aerosols in future versions of ASP.
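
The "mixing rules" discussed above determine how black carbon's refractive index is combined with that of the surrounding material before the optical calculation. As a small worked illustration, the sketch below compares a simple volume-weighted average with the Maxwell-Garnett rule for an assumed BC volume fraction; the refractive-index values are typical textbook numbers, not the ones used in the paper.

```python
import cmath

# Assumed complex refractive indices at 550 nm (illustrative values only).
m_bc = 1.95 + 0.79j     # black carbon inclusion
m_host = 1.53 + 1e-7j   # weakly absorbing host (e.g. sulfate/organic material)
f_bc = 0.05             # assumed BC volume fraction

# Volume-mixing rule: linear, volume-weighted average of the refractive indices.
m_volume = f_bc * m_bc + (1 - f_bc) * m_host

# Maxwell-Garnett rule: BC treated as small inclusions in the host medium,
# expressed via the dielectric functions (epsilon = m**2).
eps_bc, eps_host = m_bc ** 2, m_host ** 2
beta = (eps_bc - eps_host) / (eps_bc + 2 * eps_host)
eps_mg = eps_host * (1 + 3 * f_bc * beta / (1 - f_bc * beta))
m_mg = cmath.sqrt(eps_mg)

print(f"volume mixing   : m = {m_volume:.4f}")
print(f"Maxwell-Garnett : m = {m_mg:.4f}")
```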


2021 ◽  
Vol 17 (3) ◽  
pp. 1-25
Author(s):  
Bohong Zhu ◽  
Youmin Chen ◽  
Qing Wang ◽  
Youyou Lu ◽  
Jiwu Shu

Non-volatile memory and remote direct memory access (RDMA) provide extremely high performance in storage and network hardware. However, existing distributed file systems strictly isolate the file system and network layers, and the heavy layered software designs leave high-speed hardware under-exploited. In this article, we propose an RDMA-enabled distributed persistent memory file system, Octopus+, to redesign file system internal mechanisms by closely coupling non-volatile memory and RDMA features. For data operations, Octopus+ directly accesses a shared persistent memory pool to reduce memory copying overhead, and actively fetches and pushes data all in clients to rebalance the load between the server and network. For metadata operations, Octopus+ introduces self-identified remote procedure calls for immediate notification between file systems and networking, and an efficient distributed transaction mechanism for consistency. Octopus+ is enabled with a replication feature to provide better availability. Evaluations on Intel Optane DC Persistent Memory Modules show that Octopus+ achieves nearly the raw bandwidth for large I/Os and orders of magnitude better performance than existing distributed file systems.
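
The "client-active" data path described above (clients fetch from and push into a shared persistent memory pool themselves, instead of the server copying data through several software layers) can be caricatured with ordinary shared memory. The sketch below is only an analogy in Python using mmap; it has none of Octopus+'s RDMA, persistence or consistency machinery, and the pool file and record layout are invented for the example.

```python
import mmap
import os
import struct

POOL_FILE = "/dev/shm/pm_pool_demo"  # hypothetical stand-in for a persistent memory pool
POOL_SIZE = 1 << 20                  # 1 MiB demo pool
RECORD = struct.Struct("<Q64s")      # (length, payload) record layout, invented for the demo


def create_pool() -> None:
    with open(POOL_FILE, "wb") as f:
        f.truncate(POOL_SIZE)


def client_write(offset: int, payload: bytes) -> None:
    """A client writes directly into the shared pool; no server-side data copy."""
    with open(POOL_FILE, "r+b") as f, mmap.mmap(f.fileno(), POOL_SIZE) as pool:
        pool[offset:offset + RECORD.size] = RECORD.pack(len(payload), payload)


def client_read(offset: int) -> bytes:
    """Another client fetches the data itself instead of asking a server to send it."""
    with open(POOL_FILE, "r+b") as f, mmap.mmap(f.fileno(), POOL_SIZE) as pool:
        length, payload = RECORD.unpack(pool[offset:offset + RECORD.size])
        return payload[:length]


if __name__ == "__main__":
    create_pool()
    client_write(0, b"hello from client A")
    print(client_read(0))
    os.remove(POOL_FILE)
```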


Author(s):  
Armando Fandango ◽  
William Rivera

Scientific Big Data being gathered at exascale needs to be stored, retrieved and manipulated. The storage stack for scientific Big Data includes a file system at the system level for the physical organization of the data, and a file format and input/output (I/O) system at the application level for the logical organization of the data, both of a high-performance variety for exascale. The high-performance file system is designed with concurrent access, high-speed transmission and fault tolerance characteristics. High-performance file formats and I/O systems are designed to give parallel and distributed applications easy and fast access to Big Data. These specialized file formats make it easier to store and access Big Data for scientific visualization and predictive analytics. This chapter provides a brief review of the characteristics of high-performance file systems such as Lustre and GPFS, and of high-performance file formats and I/O systems such as HDF5, NetCDF, MPI-IO, and HDFS.
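
Of the application-level formats surveyed here, HDF5 is convenient to show in a few lines. The sketch below is a generic illustration with the h5py bindings; chunking and compression are the kind of features that make such formats attractive for large scientific data. The file name and dataset layout are made up for the example.

```python
import h5py
import numpy as np

data = np.random.rand(1024, 1024)  # stand-in for a large simulation field

# Write a chunked, compressed dataset plus some descriptive metadata.
with h5py.File("simulation.h5", "w") as f:
    dset = f.create_dataset("fields/pressure", data=data,
                            chunks=(128, 128), compression="gzip")
    dset.attrs["units"] = "hPa"
    dset.attrs["timestep"] = 42

# Read back only a slice; HDF5 fetches just the chunks it needs.
with h5py.File("simulation.h5", "r") as f:
    corner = f["fields/pressure"][:128, :128]
    print(corner.shape, f["fields/pressure"].attrs["units"])
```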


2016 ◽  
Author(s):  
M. J. Alvarado ◽  
C. R. Lonsdale ◽  
H. L. Macintyre ◽  
H. Bian ◽  
M. Chin ◽  
...  

Abstract. Accurate modeling of the scattering and absorption of ultraviolet and visible radiation by aerosols is essential for accurate simulations of atmospheric chemistry and climate. Closure studies using in situ measurements of aerosol scattering and absorption can be used to evaluate and improve models of aerosol optical properties without interference from model errors in aerosol emissions, transport, chemistry, or deposition rates. Here we evaluate the ability of four externally mixed, fixed size distribution parameterizations used in global models to simulate submicron aerosol scattering and absorption at three wavelengths using in situ data gathered during the 2008 Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) campaign. The four models are the NASA Global Modeling Initiative (GMI) Combo model, GEOS-Chem v9-02, the baseline configuration of a version of GEOS-Chem with online radiative transfer calculations (called GC-RT), and the Optical Properties of Aerosol and Clouds (OPAC v3.1) package. We also use the ARCTAS data to perform the first evaluation of the ability of the Aerosol Simulation Program (ASP v2.1) to simulate submicron aerosol scattering and absorption when in situ data on the aerosol size distribution are used, and examine the impact of different mixing rules for black carbon (BC) on the results. We find that the GMI model tends to overestimate submicron scattering and absorption at shorter wavelengths by 10–23 %, and that GMI has smaller absolute mean biases for submicron absorption than OPAC v3.1, GEOS-Chem v9-02, or GC-RT. However, the changes to the density and refractive index of BC in GC-RT improve the simulation of submicron aerosol absorption at all wavelengths relative to GEOS-Chem v9-02. Adding in situ size distribution information, as in ASP v2.1, improves model performance for scattering but not for absorption, likely due to the assumption in ASP v2.1 that BC is present at a constant mass fraction throughout the aerosol size distribution. Using a core-shell mixing state in ASP overestimates aerosol absorption, especially for the fresh biomass burning aerosol measured in ARCTAS-B, suggesting the need for time-varying mixing states in future versions of ASP.


Author(s):  
Bo Li ◽  
Ziyi Peng ◽  
Peng Hou ◽  
Min He ◽  
Marco Anisetti ◽  
...  

Abstract. In the Internet of Vehicles (IoV), with the increasing demand for intelligent technologies such as driverless driving, more and more in-vehicle applications are being deployed for autonomous driving. For computationally intensive tasks, the vehicular self-organizing network hands tasks over to other high-performance nodes in the driving environment for execution. In this way, the computational load of the cloud is alleviated. However, due to the unreliability of the communication links and the dynamic changes of the vehicular environment, lengthy task completion times may increase the task failure rate. Although a flooding algorithm can improve the success rate of task completion, its offloading cost is large. To address this problem, we design a partial flooding algorithm, a comprehensive evaluation method based on system reliability for vehicular computing environments without infrastructure. V2V links are used to select a subset of nodes with better performance for partial flooding offloading, which reduces task completion time, improves system reliability and lessens the impact of vehicle mobility on offloading. The results show that the proposed offloading strategy not only improves the utilization of computing resources but also improves the offloading performance of the system.
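
A minimal sketch of the partial-flooding idea: instead of replicating a task to every reachable vehicle (full flooding), the offloader scores its V2V neighbours and replicates the task only to the top few. This is a hypothetical Python illustration; the scoring weights and node attributes are assumptions, not the evaluation model used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    node_id: str
    compute_capacity: float  # normalized CPU capability of the candidate vehicle
    link_reliability: float  # estimated probability the V2V link survives the task
    contact_time: float      # predicted remaining contact time in seconds


def score(n: Neighbor) -> float:
    # Assumed weighting of compute power, link reliability and contact time.
    return (0.4 * n.compute_capacity
            + 0.4 * n.link_reliability
            + 0.2 * min(n.contact_time / 30.0, 1.0))


def partial_flood(neighbors: list[Neighbor], replicas: int = 3) -> list[str]:
    """Replicate the task to only the `replicas` best-scoring neighbours."""
    ranked = sorted(neighbors, key=score, reverse=True)
    return [n.node_id for n in ranked[:replicas]]


if __name__ == "__main__":
    nbrs = [
        Neighbor("v1", 0.9, 0.6, 12.0),
        Neighbor("v2", 0.5, 0.9, 40.0),
        Neighbor("v3", 0.7, 0.8, 25.0),
        Neighbor("v4", 0.3, 0.4, 8.0),
    ]
    print("offload task to:", partial_flood(nbrs, replicas=2))
```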


2020 ◽  
Vol 35 (1) ◽  
pp. 4-26 ◽  
Author(s):  
André Brinkmann ◽  
Kathryn Mohror ◽  
Weikuan Yu ◽  
Philip Carns ◽  
Toni Cortes ◽  
...  

2020 ◽  
Author(s):  
Dirk Barbi ◽  
Nadine Wieters ◽  
Paul Gierz ◽  
Fatemeh Chegini ◽  
Sara Khosravi ◽  
...  

Abstract. Earth system and climate modelling involves the simulation of processes on a wide range of scales and within and across various components of the Earth system. In practice, component models are often developed independently by different research groups and then combined using dedicated coupling software. This procedure not only leads to a strongly growing number of available versions of model components and coupled setups but also to model- and system-dependent ways of obtaining and operating them. Therefore, implementing these Earth System Models (ESMs) can be challenging and extremely time-consuming, especially for less experienced modellers or for scientists aiming to use different ESMs, as in the case of inter-comparison projects. To assist researchers and modellers by reducing avoidable complexity, we developed the ESM-Tools software, which provides a standard way of downloading, configuring, compiling, running and monitoring different models - coupled ESMs and stand-alone models alike - on a variety of High-Performance Computing (HPC) systems. (The ESM-Tools are equally applicable and helpful for stand-alone as for coupled models. In fact, the ESM-Tools are used as the standard compile and runtime infrastructure for FESOM2, and are currently also applied for ECHAM and ICON stand-alone simulations. As coupled ESMs are technically the more challenging task, we focus on coupled setups, always implying that stand-alone models benefit in the same way.) With the ESM-Tools, the user is only required to provide a short script consisting of just the experiment-specific definitions, while the software executes all phases of a simulation in the correct order. The software, which is well documented and easy to install and use, currently supports four ocean models, three atmosphere models, two biogeochemistry models, an ice sheet model, an isostatic adjustment model, a hydrology model and a land-surface model. ESM-Tools has recently been entirely re-coded in a high-level programming language (Python) and now provides researchers with an even more user-friendly interface for Earth system modelling. The ESM-Tools were developed within the framework of the project Advanced Earth System Model Capacity, supported by the Helmholtz Association.
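
The division of labour the abstract describes (the user supplies only a short, experiment-specific definition; the tooling runs the download, configure, compile, run and monitor phases in the right order) is sketched below in generic Python. This is not ESM-Tools code or its configuration syntax; the phase names and the experiment dictionary are purely illustrative.

```python
from typing import Callable

# A hypothetical, minimal experiment definition the user would provide.
experiment = {
    "model": "coupled_setup_X",
    "initial_date": "2000-01-01",
    "final_date": "2000-12-31",
    "account": "my_hpc_project",
}


def download(exp: dict) -> None:
    print(f"fetching sources for {exp['model']}")

def configure(exp: dict) -> None:
    print(f"writing machine-specific configuration for {exp['model']}")

def compile_(exp: dict) -> None:
    print(f"building {exp['model']}")

def run(exp: dict) -> None:
    print(f"submitting simulation {exp['initial_date']}..{exp['final_date']}")

def monitor(exp: dict) -> None:
    print("checking job status and log files")


# The infrastructure, not the user, fixes the order of the phases.
PHASES: list[Callable[[dict], None]] = [download, configure, compile_, run, monitor]

for phase in PHASES:
    phase(experiment)
```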


2018 ◽  
Vol 210 ◽  
pp. 04042
Author(s):  
Ammar Alhaj Ali ◽  
Pavel Varacha ◽  
Said Krayem ◽  
Roman Jasek ◽  
Petr Zacek ◽  
...  

Nowadays, a wide set of systems and applications, especially in high-performance computing, depends on distributed environments to process and analyse huge amounts of data. As we know, the amount of data is increasing enormously, and the goal of providing and developing efficient, scalable and reliable storage solutions has become one of the major issues for scientific computing. The storage solution used by big data systems is the Distributed File System (DFS), where a DFS is used to build a hierarchical and unified view of multiple file servers and shares on the network. In this paper we present the Hadoop Distributed File System (HDFS) as the DFS in big data systems and Event-B as a formal method that can be used for modelling. Event-B is a mature formal method which has been widely used in a number of industry projects in domains such as automotive, transportation, space, business information and medical devices. We also propose using Rodin as the modelling tool for Event-B, which integrates modelling and proving; moreover, the Rodin platform is open source and supports a large number of plug-in tools.
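
To make the HDFS side concrete, the sketch below writes and reads a file through a WebHDFS endpoint using the HdfsCLI (hdfs) Python package. The namenode address, user name and paths are placeholders; it only illustrates how applications see one unified namespace across the cluster's file servers.

```python
from hdfs import InsecureClient  # HdfsCLI, talks to HDFS over WebHDFS

# Placeholder namenode address and user; adjust to the actual cluster.
client = InsecureClient("http://namenode.example.org:9870", user="hadoop")

# Write a small file into the distributed namespace.
client.write("/experiments/run1/results.csv",
             data=b"id,value\n1,3.14\n",
             overwrite=True)

# List the directory and read the file back.
print(client.list("/experiments/run1"))
with client.read("/experiments/run1/results.csv") as reader:
    print(reader.read().decode())
```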


2021 ◽  
Vol 17 (1) ◽  
pp. 1-22
Author(s):  
Wen Cheng ◽  
Chunyan Li ◽  
Lingfang Zeng ◽  
Yingjin Qian ◽  
Xi Li ◽  
...  

In high-performance computing (HPC), data and metadata are stored on special server nodes and client applications access the servers' data and metadata through a network, which induces network latencies and resource contention. These server nodes are typically equipped with (slow) magnetic disks, while the client nodes store temporary data on fast SSDs or even on non-volatile main memory (NVMM). Therefore, the full potential of parallel file systems can only be reached if fast client-side storage devices are included in the overall storage architecture. In this article, we propose an NVMM-based hierarchical persistent client cache for the Lustre file system (NVMM-LPCC for short). NVMM-LPCC implements two caching modes: a read and write mode (RW-NVMM-LPCC for short) and a read only mode (RO-NVMM-LPCC for short). NVMM-LPCC integrates with the Lustre Hierarchical Storage Management (HSM) solution and the Lustre layout lock mechanism to provide consistent persistent caching services for I/O applications running on client nodes, while maintaining a global unified namespace of the entire Lustre file system. The evaluation results presented in this article show that NVMM-LPCC can increase the average read throughput by up to 35.80 times and the average write throughput by up to 9.83 times compared with the native Lustre system, while providing excellent scalability.
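
The general pattern of a persistent client-side cache (serve reads from a fast local device when possible, fall back to the parallel file system and populate the cache on a miss) can be sketched in a few lines. This is a generic illustration, not NVMM-LPCC's actual mechanism: it ignores HSM integration, layout locks and cache coherency, and the mount points are invented.

```python
import shutil
from pathlib import Path

LUSTRE_ROOT = Path("/lustre/project/data")  # hypothetical parallel file system path
CACHE_ROOT = Path("/mnt/nvmm_cache")        # hypothetical NVMM-backed local cache


def cached_open(relative_path: str, mode: str = "rb"):
    """Read-only caching: serve from local NVMM if present, otherwise copy in from Lustre."""
    cached = CACHE_ROOT / relative_path
    if not cached.exists():
        source = LUSTRE_ROOT / relative_path
        cached.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, cached)  # cache miss: populate from the parallel file system
    return open(cached, mode)         # cache hit (or freshly populated): local-speed I/O


if __name__ == "__main__":
    with cached_open("inputs/mesh.bin") as f:
        header = f.read(16)
        print(len(header), "bytes read via the client cache")
```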

