HPC cloud services based on the Proxmox VE platform

Вычислительные технологии ◽

10.25743/ict.2019.24.6.002. ◽

2019 ◽

Author(s):

А.В. Баранов ◽

Е.А. Киселёв

Keyword(s):

High Performance Computing ◽

Management System ◽

High Performance ◽

Virtual Machines ◽

Cloud Services ◽

Job Management ◽

Job Management System ◽

High Level ◽

Performance Computing ◽

HPC Cloud

Organizing cloud services for high-performance computing is difficult, first, because of the high overhead of virtualization and, second, because of the specifics of the job and resource management systems used in scientific supercomputer centers. This paper considers an approach to building PaaS and SaaS cloud services based on the joint operation of the Proxmox VE cloud platform and the parallel job management system used as the resource manager at the Joint Supercomputer Center of the Russian Academy of Sciences. Purpose. The purpose of this paper is to develop methods and technologies for building high-performance computing cloud services in scientific supercomputer centers. Methodology. To build a cloud environment for high-performance computing (HPC), a three-level model and a method of combining flows of supercomputer jobs of various types were applied. Results. A high-level HPC cloud services technology based on the free Proxmox VE software platform has been developed. The Proxmox VE platform has been integrated with the domestic supercomputer job management system SUPPZ. Experimental estimates of the overhead introduced into the high-performance computing process by the Proxmox components were obtained. Findings. An approach to integrating a supercomputer job management system with a virtualization platform is proposed. The approach is based on representing supercomputer jobs as virtual machines or containers. Using the Proxmox VE platform as an example, the influence of a virtual environment on the execution time of parallel programs is investigated experimentally. The applicability of the proposed approach to building PaaS and SaaS cloud services in shared-use scientific supercomputing centers is substantiated for the class of applications for which the overhead introduced by the Proxmox components is acceptable.
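The acceptability criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that overhead is measured as the relative increase in execution time of the same parallel program run on bare metal and inside a virtual machine or container. The function names and the 5% threshold are hypothetical and are not taken from the paper.

```python
def relative_overhead(bare_metal_s: float, virtualized_s: float) -> float:
    """Return the fractional slowdown of a virtualized run versus bare metal."""
    if bare_metal_s <= 0:
        raise ValueError("bare-metal run time must be positive")
    return (virtualized_s - bare_metal_s) / bare_metal_s


def is_acceptable(bare_metal_s: float, virtualized_s: float,
                  threshold: float = 0.05) -> bool:
    """Check the slowdown against a site-chosen limit (hypothetical default: 5%)."""
    return relative_overhead(bare_metal_s, virtualized_s) <= threshold
```

With timings measured for a given application, `is_acceptable(t_bare, t_virtual)` gives a simple yes/no answer on whether that application belongs to the class suited to the virtualized service; the threshold would be set by the center's own policy.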


HPC Cloud Architecture to Reduce HPC Workflow Complexity in Containerized Environments

Applied Sciences ◽

10.3390/app11030923 ◽

2021 ◽

Vol 11 (3) ◽

pp. 923

Author(s):

Guohua Li ◽

Joon Woo ◽

Sang Boem Lim

Keyword(s):

High Performance ◽

Cloud Services ◽

Workload Management ◽

Job Management ◽

Security Issues ◽

Cloud Architecture ◽

Management Efficiency ◽

Complexity Problem ◽

And Performance ◽

HPC Cloud

The complexity of high-performance computing (HPC) workflows is an important issue in the provision of HPC cloud services in most national supercomputing centers. This complexity problem is especially critical because it affects HPC resource scalability, management efficiency, and convenience of use. To solve this problem, while exploiting the advantage of bare-metal-level high performance, container-based cloud solutions have been developed. However, various problems still exist, such as an isolated environment between HPC and the cloud, security issues, and workload management issues. We propose an architecture that reduces this complexity by using Docker and Singularity, which are the container platforms most often used in the HPC cloud field. This HPC cloud architecture integrates both image management and job management, which are the two main elements of HPC cloud workflows. To evaluate the serviceability and performance of the proposed architecture, we developed and implemented a platform in an HPC cluster experiment. Experimental results indicated that the proposed HPC cloud architecture can reduce complexity to provide supercomputing resource scalability, high performance, user convenience, various HPC applications, and management efficiency.


vCUDA: GPU accelerated high performance computing in virtual machines

2009 IEEE International Symposium on Parallel & Distributed Processing ◽

10.1109/ipdps.2009.5161020 ◽

2009 ◽

Cited By ~ 20

Author(s):

Lin Shi ◽

Hao Chen ◽

Jianhua Sun

Keyword(s):

High Performance Computing ◽

High Performance ◽

Virtual Machines ◽

Performance Computing


Exploring the Use of Virtual Machines and Virtual Clusters for High Performance Computing Education.

10.18260/1-2--17971 ◽

2020 ◽

Author(s):

Thomas Hacker

Keyword(s):

High Performance Computing ◽

High Performance ◽

Virtual Machines ◽

Computing Education ◽

Virtual Clusters ◽

Performance Computing


2016 Sixth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC)

10.1109/wolfhpc40351.2016 ◽

2016 ◽

Keyword(s):

High Performance Computing ◽

High Performance ◽

International Workshop ◽

Domain Specific Languages ◽

Domain Specific ◽

Sixth International Workshop ◽

High Level ◽

Performance Computing ◽

Sixth International


Viability of Cloud Computing for Real-Time Numerical Weather Prediction

Weather and Forecasting ◽

10.1175/waf-d-16-0075.1 ◽

2016 ◽

Vol 31 (6) ◽

pp. 1985-1996 ◽

Cited By ~ 9

Author(s):

David Siuta ◽

Gregory West ◽

Henryk Modzelewski ◽

Roland Schigas ◽

Roland Stull

Keyword(s):

Cloud Computing ◽

High Performance Computing ◽

Real Time ◽

Numerical Weather Prediction ◽

High Performance ◽

Virtual Machines ◽

Weather Prediction ◽

Cloud Platform ◽

Numerical Weather ◽

Performance Computing

As cloud-service providers like Google, Amazon, and Microsoft decrease costs and increase performance, numerical weather prediction (NWP) in the cloud will become a reality not only for research use but for real-time use as well. The performance of the Weather Research and Forecasting (WRF) Model on the Google Cloud Platform is tested and configurations and optimizations of virtual machines that meet two main requirements of real-time NWP are found: 1) fast forecast completion (timeliness) and 2) economic cost effectiveness when compared with traditional on-premise high-performance computing hardware. Optimum performance was found by using the Intel compiler collection with no more than eight virtual CPUs per virtual machine. Using these configurations, real-time NWP on the Google Cloud Platform is found to be economically competitive when compared with the purchase of local high-performance computing hardware for NWP needs. Cloud-computing services are becoming viable alternatives to on-premise compute clusters for some applications.
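The two requirements named in the abstract, timeliness and cost effectiveness, can be expressed as a small check. This is only a sketch under assumed inputs: the cost model (hourly VM pricing, straight-line amortization of purchased hardware) and all function and parameter names are hypothetical and do not reproduce the paper's analysis.

```python
def cloud_annual_cost(vm_count: int, hours_per_run: float,
                      runs_per_year: int, price_per_vm_hour: float) -> float:
    """Yearly cloud bill, assuming VMs are billed only while a forecast runs."""
    return vm_count * hours_per_run * runs_per_year * price_per_vm_hour


def on_prem_annual_cost(purchase_price: float, lifetime_years: float,
                        annual_operating_cost: float) -> float:
    """Yearly cost of local hardware: amortized purchase plus operations."""
    return purchase_price / lifetime_years + annual_operating_cost


def meets_requirements(hours_per_run: float, deadline_hours: float,
                       cloud_cost: float, local_cost: float) -> bool:
    """True if the forecast finishes in time and the cloud is not more expensive."""
    timely = hours_per_run <= deadline_hours
    economical = cloud_cost <= local_cost
    return timely and economical
```

A configuration would be compared by computing both annual costs from a site's own prices and passing them, together with the measured run time and the forecast deadline, to `meets_requirements`.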
