Performance characterization of containerization for HPC workloads on InfiniBand clusters: an empirical study

2021
Author(s):
Peini Liu
Jordi Guitart

Abstract: Containerization technology offers an appealing alternative for encapsulating and operating applications (and all their dependencies) without the performance penalties of Virtual Machines and, as a result, has attracted the interest of the High-Performance Computing (HPC) community as a way to obtain fast, customized, portable, flexible, and reproducible deployments of their workloads. Previous work in this area has demonstrated that containerized HPC applications can exploit InfiniBand networks, but has ignored the potential of multi-container deployments, which partition the processes belonging to each application into multiple containers on each host. Partitioning HPC applications has proven useful with virtual machines by constraining each of them to a single NUMA (Non-Uniform Memory Access) domain. This paper conducts a systematic study of the performance of multi-container deployments with different network fabrics and protocols, focusing especially on InfiniBand networks. We analyze the impact of container granularity and its potential to exploit processor and memory affinity to improve application performance. Our results show that default Singularity can achieve near bare-metal performance but does not support fine-grained multi-container deployments. Docker and Singularity-instance behave similarly in terms of the performance of deployment schemes with different container granularity and affinity. This behavior differs across network fabrics and protocols, and also depends on the application's communication pattern and the message size. Moreover, deployments on InfiniBand are also more affected by computation and memory allocation and, because of that, can exploit affinity better.
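To make the idea of NUMA-aware multi-container partitioning concrete, here is a minimal Python sketch (not taken from the paper; the image name, core ranges, and two-node topology are assumptions) that launches one Docker container per NUMA domain using Docker's standard --cpuset-cpus and --cpuset-mems flags:

```python
# Illustrative sketch (not from the paper): launching a multi-container
# deployment in which each container is pinned to one NUMA domain.
# The image name, CPU ranges, and node topology are hypothetical placeholders.
import subprocess

NUMA_DOMAINS = [
    {"cpus": "0-15", "mem": "0"},   # NUMA node 0 (assumed topology)
    {"cpus": "16-31", "mem": "1"},  # NUMA node 1 (assumed topology)
]

def launch_partitioned_containers(image="hpc-app:latest"):
    """Start one container per NUMA domain, constraining CPU and memory placement."""
    procs = []
    for i, dom in enumerate(NUMA_DOMAINS):
        cmd = [
            "docker", "run", "--rm", "--name", f"hpc-part-{i}",
            "--cpuset-cpus", dom["cpus"],   # pin this container's processes to the domain's cores
            "--cpuset-mems", dom["mem"],    # allocate memory from the local NUMA node only
            image,
        ]
        procs.append(subprocess.Popen(cmd))
    for p in procs:
        p.wait()

if __name__ == "__main__":
    launch_partitioned_containers()
```

A Singularity-instance deployment could be driven analogously, for example by starting each instance under an external CPU/memory binding tool.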

Author(s):  
Ouidad Achahbar
Mohamed Riduan Abid

The ongoing pervasiveness of Internet access is rapidly increasing Big Data production. This, in turn, increases the demand for compute power to process this massive data, turning High Performance Computing (HPC) into a highly solicited service. Based on the paradigm of providing computing as a utility, the Cloud offers user-friendly infrastructures for processing Big Data, e.g., High Performance Computing as a Service (HPCaaS). Still, HPCaaS performance is tightly coupled with the underlying virtualization technique, since the latter is responsible for creating the virtual machines that carry out the data processing jobs. In this paper, the authors evaluate the impact of virtualization on HPCaaS. They track HPC performance under different Cloud virtualization platforms, namely KVM and VMware-ESXi, and compare it against physical clusters. Each tested cluster exhibited different performance trends, yet the overall analysis of the findings showed that the selection of virtualization technology can lead to significant improvements when handling HPCaaS.
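As a rough illustration of the kind of comparison described above, the following sketch (with entirely hypothetical runtimes, not the authors' measurements) computes the relative overhead of virtualized clusters against a bare-metal baseline:

```python
# Minimal sketch (numbers are hypothetical, not the paper's results):
# quantifying virtualization overhead as the slowdown of a virtualized
# cluster relative to a physical (bare-metal) baseline.
BASELINE_SECONDS = {"terasort": 412.0, "wordcount": 288.0}  # assumed bare-metal runtimes
VIRTUALIZED_SECONDS = {
    "KVM":         {"terasort": 455.0, "wordcount": 301.0},
    "VMware-ESXi": {"terasort": 440.0, "wordcount": 296.0},
}

def overhead_percent(virtual: float, physical: float) -> float:
    """Relative slowdown of the virtualized run versus bare metal."""
    return 100.0 * (virtual - physical) / physical

for platform, runs in VIRTUALIZED_SECONDS.items():
    for bench, seconds in runs.items():
        print(f"{platform:12s} {bench:10s} overhead = "
              f"{overhead_percent(seconds, BASELINE_SECONDS[bench]):.1f}%")
```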


Author(s):  
Antonio Tadeu Gomes
Enzo Molion
Roberto Pinto Souto
Jean François Méhaut

A memory allocation anomaly occurs when the allocation of a set of heap blocks imposes an unnecessary overhead on the execution of an application. In this paper, we propose a method for identifying, locating, characterizing, and fixing allocation anomalies, and a tool that developers can use to apply the method. We evaluate our method and tool with a numerical simulator that approximates the solutions of partial differential equations using a finite element method. We show that taming allocation anomalies in this simulator reduces the memory footprint of its processes by 37.27% and the execution time by 16.52%. We conclude that developers of high-performance computing applications can benefit from the method and tool during the software development cycle.
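The paper's method and tool target heap allocations in native HPC codes; purely as an illustration of the underlying idea, the sketch below uses Python's tracemalloc to locate the call sites responsible for the most allocation in a toy program with an anomalous allocate-per-iteration pattern:

```python
# Illustrative sketch only: this is not the paper's tool, just a small
# demonstration of locating allocation hotspots with Python's tracemalloc.
import tracemalloc

def assemble(n):
    # Anomalous pattern: a short-lived temporary list is allocated on every
    # iteration of a hot loop instead of being reused.
    rows = []
    for _ in range(n):
        tmp = [float(j) for j in range(256)]  # temporary block allocated per iteration
        rows.append(sum(tmp))
    return rows

tracemalloc.start()
assemble(10_000)
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)  # top allocation sites: file, line, total size, block count
```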


Sensors
2021
Vol 21 (19)
pp. 6574
Author(s):
Ana Belén Rodríguez González
Mark R. Wilby
Juan José Vinagre Díaz
Rubén Fernández Pozo

COVID-19 has dramatically struck every section of our society: health, economy, employment, and mobility. This work presents a data-driven characterization of the impact of the COVID-19 pandemic on public and private mobility in a mid-size city in Spain (Fuenlabrada). Our analysis used real data collected from the public transport smart card system and a Bluetooth traffic monitoring network from February to September 2020, thus covering the relevant phases of the pandemic. Our results show that, at the peak of the pandemic, public and private mobility dramatically decreased by 95% and 86% of their pre-COVID-19 values, respectively, after which the latter experienced a faster recovery. In addition, our analysis of daily patterns evidenced a clear change in users' mobility behavior during the different phases of the pandemic. Based on these findings, we developed short-term predictors of future public transport demand to provide operators and mobility managers with accurate information to optimize their service and avoid crowded areas. Our prediction model achieved high performance for the pre- and post-state-of-alarm phases. Consequently, this work contributes to enlarging the knowledge about the impact of the pandemic on mobility, providing a deep analysis of how it affected each transport mode in a mid-size city.
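As an illustration of what a short-term demand predictor can look like (the authors' actual model and data are not reproduced here; the series and parameters below are hypothetical), the following sketch blends a seasonal-naive term with a recent-trend correction:

```python
# Minimal sketch (assumed data and model, not the authors' predictor):
# a short-term forecast of public transport demand combining a
# seasonal-naive baseline with a recent-trend correction.
import numpy as np

def forecast_next_day(daily_boardings, season=7, trend_window=14, alpha=0.3):
    """Predict tomorrow's boardings from a daily series.

    seasonal part: value observed `season` days ago (same weekday);
    trend part:    mean of the last `trend_window` days.
    """
    x = np.asarray(daily_boardings, dtype=float)
    seasonal = x[-season]
    trend = x[-trend_window:].mean()
    return alpha * trend + (1.0 - alpha) * seasonal

# Hypothetical series of daily smart-card boardings (4 weeks)
history = np.array([12000, 11800, 12100, 12500, 13000, 7000, 6500] * 4)
print(f"next-day forecast: {forecast_next_day(history):.0f} boardings")
```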


2014
Vol 17 (2)
Author(s):
Germán Bianchini
Paola Caymes Scutari

Forest fires are a major risk factor with a strong impact at the eco-environmental and socio-economic levels, which is why their study and modeling are very important. However, the models frequently have a certain level of uncertainty in some input parameters, given that these must be approximated or estimated as a consequence of the diverse difficulties in accurately measuring the conditions of the phenomenon in real time. This has resulted in the development of several methods for uncertainty reduction, whose trade-off between accuracy and complexity can vary significantly. The ESS (Evolutionary-Statistical System) is a method that aims to reduce this uncertainty by combining Statistical Analysis, High Performance Computing (HPC), and Parallel Evolutionary Algorithms (PEAs). The PEAs use several parameters that require adjustment and that determine the quality of their results. Calibrating these parameters is crucial for reaching good performance and improving the system output. This paper presents an empirical study of parameter tuning to evaluate the effectiveness of different configurations and the impact of their use in forest fire prediction.
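To illustrate what calibrating PEA parameters involves, here is a toy Python sketch (not the ESS implementation; the objective function and parameter grid are invented for illustration) that sweeps population size and mutation strength and reports the best fitness reached by a simple evolutionary loop:

```python
# Toy sketch of parameter calibration for an evolutionary algorithm.
# The objective, grid values, and evolutionary loop are illustrative only.
import random

def toy_fitness(x):
    return -(x - 3.0) ** 2  # maximized at x = 3

def run_ea(pop_size, mutation_sigma, generations=50, seed=0):
    rng = random.Random(seed)
    pop = [rng.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=toy_fitness, reverse=True)
        parents = pop[: max(2, pop_size // 4)]          # truncation selection
        pop = [rng.choice(parents) + rng.gauss(0, mutation_sigma)
               for _ in range(pop_size)]                # Gaussian mutation
    return max(toy_fitness(x) for x in pop)

for pop_size in (20, 80):
    for sigma in (0.1, 1.0):
        best = run_ea(pop_size, sigma)
        print(f"pop={pop_size:3d} sigma={sigma:.1f} best fitness={best:.4f}")
```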


Author(s):  
Qiang Guan
Nathan DeBardeleben
Sean Blanchard
Song Fu
Claude H. Davis IV
...  

As the high performance computing (HPC) community continues to push towards exascale computing, today's HPC applications are affected by soft errors only to a small degree, but we expect this to become a more serious issue as HPC systems grow. We propose F-SEFI, a Fine-grained Soft Error Fault Injector, as a tool for profiling software robustness against soft errors. We use soft error injection to mimic the impact of errors on logic circuit behavior. Leveraging the open-source virtual machine hypervisor QEMU, F-SEFI enables users to modify emulated machine instructions to introduce soft errors. F-SEFI can control which application and which sub-function to target, and when and how to inject soft errors at different granularities, without interfering with other applications that share the same environment. We demonstrate use cases of F-SEFI on several benchmark applications with different characteristics to show how data corruption can propagate to incorrect results. The findings from the fault injection campaign can be used for designing robust software and power-efficient hardware.
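F-SEFI operates on emulated machine instructions inside QEMU; the following self-contained Python sketch only mimics the basic idea of a single-event upset, flipping one bit of a double-precision value and showing how the corruption propagates into a reduction:

```python
# Conceptual sketch only: this is not F-SEFI. It imitates a soft error by
# flipping one bit in the IEEE-754 representation of a double and then
# observing how the corruption propagates to the final result.
import random
import struct

def flip_random_bit(value: float, rng: random.Random) -> float:
    """Flip one randomly chosen bit of `value`'s 64-bit representation."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    bits ^= 1 << rng.randrange(64)
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", bits))
    return corrupted

rng = random.Random(42)
data = [float(i) for i in range(1, 101)]
clean_sum = sum(data)

# Inject a "soft error" into one element before the reduction.
victim = rng.randrange(len(data))
data[victim] = flip_random_bit(data[victim], rng)
print(f"clean sum = {clean_sum}, corrupted sum = {sum(data)}")
```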


2016
Vol 31 (6)
pp. 1985-1996
Author(s):
David Siuta
Gregory West
Henryk Modzelewski
Roland Schigas
Roland Stull

Abstract: As cloud-service providers like Google, Amazon, and Microsoft decrease costs and increase performance, numerical weather prediction (NWP) in the cloud will become a reality not only for research use but for real-time use as well. The performance of the Weather Research and Forecasting (WRF) Model on the Google Cloud Platform is tested, and virtual machine configurations and optimizations are found that meet the two main requirements of real-time NWP: 1) fast forecast completion (timeliness) and 2) economic cost effectiveness when compared with traditional on-premise high-performance computing hardware. Optimum performance was found by using the Intel compiler collection with no more than eight virtual CPUs per virtual machine. Using these configurations, real-time NWP on the Google Cloud Platform is found to be economically competitive with the purchase of local high-performance computing hardware for NWP needs. Cloud-computing services are becoming viable alternatives to on-premise compute clusters for some applications.
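As a back-of-the-envelope illustration of the timeliness-versus-cost trade-off evaluated in the study (all prices, machine counts, and runtimes below are hypothetical), one can compare the per-forecast cost of cloud virtual machines against an amortized on-premise cluster:

```python
# Back-of-the-envelope sketch (all figures hypothetical, not from the paper):
# per-forecast cost of cloud VMs versus an amortized on-premise cluster.
def cloud_cost_per_forecast(vm_hourly_usd, vms, wallclock_hours):
    return vm_hourly_usd * vms * wallclock_hours

def onprem_cost_per_forecast(capex_usd, lifetime_years, forecasts_per_day,
                             yearly_opex_usd):
    total = capex_usd + yearly_opex_usd * lifetime_years
    return total / (lifetime_years * 365 * forecasts_per_day)

cloud = cloud_cost_per_forecast(vm_hourly_usd=0.38, vms=16, wallclock_hours=1.5)
onprem = onprem_cost_per_forecast(capex_usd=60_000, lifetime_years=4,
                                  forecasts_per_day=4, yearly_opex_usd=6_000)
print(f"cloud:   ${cloud:.2f} per forecast")
print(f"on-prem: ${onprem:.2f} per forecast")
```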

