Performance characterization of containerization for HPC workloads on InfiniBand clusters: an empirical study

2021
Author(s):
Peini Liu
Jordi Guitart

Abstract: Containerization technology offers an appealing alternative for encapsulating and operating applications (and all their dependencies) without the performance penalties of Virtual Machines and, as a result, has attracted the interest of the High-Performance Computing (HPC) community as a way to obtain fast, customized, portable, flexible, and reproducible deployments of their workloads. Previous work in this area has demonstrated that containerized HPC applications can exploit InfiniBand networks, but has ignored the potential of multi-container deployments, which partition the processes belonging to each application into multiple containers on each host. Partitioning HPC applications has proven useful with virtual machines by constraining each of them to a single NUMA (Non-Uniform Memory Access) domain. This paper conducts a systematic study of the performance of multi-container deployments with different network fabrics and protocols, focusing especially on InfiniBand networks. We analyze the impact of container granularity and its potential to exploit processor and memory affinity to improve application performance. Our results show that default Singularity can achieve near bare-metal performance but does not support fine-grained multi-container deployments. Docker and Singularity-instance behave similarly in terms of the performance of deployment schemes with different container granularity and affinity. This behavior differs across network fabrics and protocols, and also depends on the application's communication pattern and the message size. Moreover, deployments on InfiniBand are also more affected by computation and memory allocation and, because of that, can exploit affinity better.
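To make the idea of NUMA-aware multi-container partitioning concrete, here is a minimal Python sketch (not taken from the paper; the image name, core ranges, and two-node topology are assumptions) that launches one Docker container per NUMA domain using Docker's standard --cpuset-cpus and --cpuset-mems flags:

```python
# Illustrative sketch (not from the paper): launching a multi-container
# deployment in which each container is pinned to one NUMA domain.
# The image name, CPU ranges, and node topology are hypothetical placeholders.
import subprocess

NUMA_DOMAINS = [
    {"cpus": "0-15", "mem": "0"},   # NUMA node 0 (assumed topology)
    {"cpus": "16-31", "mem": "1"},  # NUMA node 1 (assumed topology)
]

def launch_partitioned_containers(image="hpc-app:latest"):
    """Start one container per NUMA domain, constraining CPU and memory placement."""
    procs = []
    for i, dom in enumerate(NUMA_DOMAINS):
        cmd = [
            "docker", "run", "--rm", "--name", f"hpc-part-{i}",
            "--cpuset-cpus", dom["cpus"],   # pin this container's processes to the domain's cores
            "--cpuset-mems", dom["mem"],    # allocate memory from the local NUMA node only
            image,
        ]
        procs.append(subprocess.Popen(cmd))
    for p in procs:
        p.wait()

if __name__ == "__main__":
    launch_partitioned_containers()
```

A Singularity-instance deployment could be driven analogously, for example by starting each instance under an external CPU/memory binding tool.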

Author(s):  
Ouidad Achahbar
Mohamed Riduan Abid

The ongoing pervasiveness of Internet access is rapidly increasing Big Data production. This, in turn, increases the demand for compute power to process this massive data, turning High Performance Computing (HPC) into a highly solicited service. Based on the paradigm of providing computing as a utility, the Cloud offers user-friendly infrastructures for processing Big Data, e.g., High Performance Computing as a Service (HPCaaS). Still, HPCaaS performance is tightly coupled with the underlying virtualization technique, since the latter is responsible for creating the virtual machines that carry out the data processing jobs. In this paper, the authors evaluate the impact of virtualization on HPCaaS. They track HPC performance under different Cloud virtualization platforms, namely KVM and VMware-ESXi, and compare it against physical clusters. Each tested cluster exhibited different performance trends, yet the overall analysis of the findings showed that the selection of virtualization technology can lead to significant improvements when handling HPCaaS.
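As a rough illustration of the kind of comparison described above, the following sketch (with entirely hypothetical runtimes, not the authors' measurements) computes the relative overhead of virtualized clusters against a bare-metal baseline:

```python
# Minimal sketch (numbers are hypothetical, not the paper's results):
# quantifying virtualization overhead as the slowdown of a virtualized
# cluster relative to a physical (bare-metal) baseline.
BASELINE_SECONDS = {"terasort": 412.0, "wordcount": 288.0}  # assumed bare-metal runtimes
VIRTUALIZED_SECONDS = {
    "KVM":         {"terasort": 455.0, "wordcount": 301.0},
    "VMware-ESXi": {"terasort": 440.0, "wordcount": 296.0},
}

def overhead_percent(virtual: float, physical: float) -> float:
    """Relative slowdown of the virtualized run versus bare metal."""
    return 100.0 * (virtual - physical) / physical

for platform, runs in VIRTUALIZED_SECONDS.items():
    for bench, seconds in runs.items():
        print(f"{platform:12s} {bench:10s} overhead = "
              f"{overhead_percent(seconds, BASELINE_SECONDS[bench]):.1f}%")
```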


Author(s):  
Antonio Tadeu Gomes
Enzo Molion
Roberto Pinto Souto
Jean François Méhaut

A memory allocation anomaly occurs when the allocation of a set of heap blocks imposes an unnecessary overhead on the execution of an application. In this paper, we propose a method for identifying, locating, characterizing, and fixing allocation anomalies, and a tool that developers can use to apply the method. We evaluate our method and tool with a numerical simulator that approximates the solutions of partial differential equations using a finite element method. We show that taming allocation anomalies in this simulator reduces the memory footprint of its processes by 37.27% and the execution time by 16.52%. We conclude that developers of high-performance computing applications can benefit from the method and tool during the software development cycle.
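The paper's method and tool target heap allocations in native HPC codes; purely as an illustration of the underlying idea, the sketch below uses Python's tracemalloc to locate the call sites responsible for the most allocation in a toy program with an anomalous allocate-per-iteration pattern:

```python
# Illustrative sketch only: this is not the paper's tool, just a small
# demonstration of locating allocation hotspots with Python's tracemalloc.
import tracemalloc

def assemble(n):
    # Anomalous pattern: a short-lived temporary list is allocated on every
    # iteration of a hot loop instead of being reused.
    rows = []
    for _ in range(n):
        tmp = [float(j) for j in range(256)]  # temporary block allocated per iteration
        rows.append(sum(tmp))
    return rows

tracemalloc.start()
assemble(10_000)
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)  # top allocation sites: file, line, total size, block count
```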


Sensors
2021
Vol 21 (19)
pp. 6574
Author(s):
Ana Belén Rodríguez González
Mark R. Wilby
Juan José Vinagre Díaz
Rubén Fernández Pozo

COVID-19 has dramatically struck every section of our society: health, economy, employment, and mobility. This work presents a data-driven characterization of the impact of the COVID-19 pandemic on public and private mobility in a mid-size city in Spain (Fuenlabrada). Our analysis used real data collected from the public transport smart card system and a Bluetooth traffic monitoring network from February to September 2020, thus covering the relevant phases of the pandemic. Our results show that, at the peak of the pandemic, public and private mobility dramatically decreased by 95% and 86% of their pre-COVID-19 values, respectively, after which the latter experienced a faster recovery. In addition, our analysis of daily patterns evidenced a clear change in users' mobility behavior during the different phases of the pandemic. Based on these findings, we developed short-term predictors of future public transport demand to provide operators and mobility managers with accurate information to optimize their service and avoid crowded areas. Our prediction model achieved high performance for the pre- and post-state-of-alarm phases. Consequently, this work contributes to enlarging the knowledge about the impact of the pandemic on mobility, providing a deep analysis of how it affected each transport mode in a mid-size city.
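As an illustration of what a short-term demand predictor can look like (the authors' actual model and data are not reproduced here; the series and parameters below are hypothetical), the following sketch blends a seasonal-naive term with a recent-trend correction:

```python
# Minimal sketch (assumed data and model, not the authors' predictor):
# a short-term forecast of public transport demand combining a
# seasonal-naive baseline with a recent-trend correction.
import numpy as np

def forecast_next_day(daily_boardings, season=7, trend_window=14, alpha=0.3):
    """Predict tomorrow's boardings from a daily series.

    seasonal part: value observed `season` days ago (same weekday);
    trend part:    mean of the last `trend_window` days.
    """
    x = np.asarray(daily_boardings, dtype=float)
    seasonal = x[-season]
    trend = x[-trend_window:].mean()
    return alpha * trend + (1.0 - alpha) * seasonal

# Hypothetical series of daily smart-card boardings (4 weeks)
history = np.array([12000, 11800, 12100, 12500, 13000, 7000, 6500] * 4)
print(f"next-day forecast: {forecast_next_day(history):.0f} boardings")
```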


2014
Vol 17 (2)
Author(s):
Germán Bianchini
Paola Caymes Scutari

Forest fires are a major risk factor with a strong impact at the eco-environmental and socio-economic levels, which is why their study and modeling are very important. However, the models frequently have a certain level of uncertainty in some input parameters, given that these must be approximated or estimated as a consequence of the diverse difficulties in accurately measuring the conditions of the phenomenon in real time. This has resulted in the development of several methods for uncertainty reduction, whose trade-off between accuracy and complexity can vary significantly. The ESS (Evolutionary-Statistical System) is a method that aims to reduce this uncertainty by combining Statistical Analysis, High Performance Computing (HPC), and Parallel Evolutionary Algorithms (PEAs). The PEAs use several parameters that require adjustment and that determine the quality of their results. Calibrating these parameters is crucial for reaching good performance and improving the system output. This paper presents an empirical study of parameter tuning to evaluate the effectiveness of different configurations and the impact of their use in forest fire prediction.
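To illustrate what calibrating PEA parameters involves, here is a toy Python sketch (not the ESS implementation; the objective function and parameter grid are invented for illustration) that sweeps population size and mutation strength and reports the best fitness reached by a simple evolutionary loop:

```python
# Toy sketch of parameter calibration for an evolutionary algorithm.
# The objective, grid values, and evolutionary loop are illustrative only.
import random

def toy_fitness(x):
    return -(x - 3.0) ** 2  # maximized at x = 3

def run_ea(pop_size, mutation_sigma, generations=50, seed=0):
    rng = random.Random(seed)
    pop = [rng.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=toy_fitness, reverse=True)
        parents = pop[: max(2, pop_size // 4)]          # truncation selection
        pop = [rng.choice(parents) + rng.gauss(0, mutation_sigma)
               for _ in range(pop_size)]                # Gaussian mutation
    return max(toy_fitness(x) for x in pop)

for pop_size in (20, 80):
    for sigma in (0.1, 1.0):
        best = run_ea(pop_size, sigma)
        print(f"pop={pop_size:3d} sigma={sigma:.1f} best fitness={best:.4f}")
```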


Author(s):  
Qiang Guan
Nathan DeBardeleben
Sean Blanchard
Song Fu
Claude H. Davis IV
...  

As the high performance computing (HPC) community continues to push towards exascale computing, today's HPC applications are affected by soft errors only to a small degree, but we expect this to become a more serious issue as HPC systems grow. We propose F-SEFI, a Fine-grained Soft Error Fault Injector, as a tool for profiling software robustness against soft errors. We use soft error injection to mimic the impact of errors on logic circuit behavior. Leveraging the open-source virtual machine hypervisor QEMU, F-SEFI enables users to modify emulated machine instructions to introduce soft errors. F-SEFI can control which application and which sub-function to target, and when and how to inject soft errors at different granularities, without interfering with other applications that share the same environment. We demonstrate use cases of F-SEFI on several benchmark applications with different characteristics to show how data corruption can propagate to incorrect results. The findings from the fault injection campaign can be used for designing robust software and power-efficient hardware.
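F-SEFI operates on emulated machine instructions inside QEMU; the following self-contained Python sketch only mimics the basic idea of a single-event upset, flipping one bit of a double-precision value and showing how the corruption propagates into a reduction:

```python
# Conceptual sketch only: this is not F-SEFI. It imitates a soft error by
# flipping one bit in the IEEE-754 representation of a double and then
# observing how the corruption propagates to the final result.
import random
import struct

def flip_random_bit(value: float, rng: random.Random) -> float:
    """Flip one randomly chosen bit of `value`'s 64-bit representation."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    bits ^= 1 << rng.randrange(64)
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", bits))
    return corrupted

rng = random.Random(42)
data = [float(i) for i in range(1, 101)]
clean_sum = sum(data)

# Inject a "soft error" into one element before the reduction.
victim = rng.randrange(len(data))
data[victim] = flip_random_bit(data[victim], rng)
print(f"clean sum = {clean_sum}, corrupted sum = {sum(data)}")
```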


2016
Vol 31 (6)
pp. 1985-1996
Author(s):
David Siuta
Gregory West
Henryk Modzelewski
Roland Schigas
Roland Stull

Abstract: As cloud-service providers like Google, Amazon, and Microsoft decrease costs and increase performance, numerical weather prediction (NWP) in the cloud will become a reality not only for research use but for real-time use as well. The performance of the Weather Research and Forecasting (WRF) Model on the Google Cloud Platform is tested, and virtual machine configurations and optimizations are found that meet the two main requirements of real-time NWP: 1) fast forecast completion (timeliness) and 2) economic cost effectiveness when compared with traditional on-premise high-performance computing hardware. Optimum performance was found by using the Intel compiler collection with no more than eight virtual CPUs per virtual machine. Using these configurations, real-time NWP on the Google Cloud Platform is found to be economically competitive with the purchase of local high-performance computing hardware for NWP needs. Cloud-computing services are becoming viable alternatives to on-premise compute clusters for some applications.
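As a back-of-the-envelope illustration of the timeliness-versus-cost trade-off evaluated in the study (all prices, machine counts, and runtimes below are hypothetical), one can compare the per-forecast cost of cloud virtual machines against an amortized on-premise cluster:

```python
# Back-of-the-envelope sketch (all figures hypothetical, not from the paper):
# per-forecast cost of cloud VMs versus an amortized on-premise cluster.
def cloud_cost_per_forecast(vm_hourly_usd, vms, wallclock_hours):
    return vm_hourly_usd * vms * wallclock_hours

def onprem_cost_per_forecast(capex_usd, lifetime_years, forecasts_per_day,
                             yearly_opex_usd):
    total = capex_usd + yearly_opex_usd * lifetime_years
    return total / (lifetime_years * 365 * forecasts_per_day)

cloud = cloud_cost_per_forecast(vm_hourly_usd=0.38, vms=16, wallclock_hours=1.5)
onprem = onprem_cost_per_forecast(capex_usd=60_000, lifetime_years=4,
                                  forecasts_per_day=4, yearly_opex_usd=6_000)
print(f"cloud:   ${cloud:.2f} per forecast")
print(f"on-prem: ${onprem:.2f} per forecast")
```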

