Estimating the overhead and coupling of scientific computing clusters

SIMULATION ◽  
2021 ◽  
pp. 003754972110641
Author(s):  
Aurelio Vivas ◽  
Harold Castro

Since simulation became the third pillar of scientific research, several forms of computers have become available to drive computer-aided simulations, and nowadays clusters are the most popular type of computer supporting these tasks. Cluster settings such as supercomputers, clusters of workstations (COW), clusters of desktops (COD), and clusters of virtual machines (COV) have been considered in the literature to support a wide range of scientific applications. However, applications categorized as high-performance computing (HPC) are conventionally assumed to be addressed only by supercomputers. In this context, we introduce the notions of cluster overhead and cluster coupling to assess the capacity of non-HPC systems to handle HPC applications. We also compare the cluster overhead with an existing measure of overhead in computing systems, the total parallel overhead, to support the correctness of our methodology. The capacity evaluation considers the seven dwarfs of scientific computing, a set of well-known algorithmic building blocks used in the development of HPC applications. Evaluating these building blocks provides insight into the strengths and weaknesses of non-HPC systems when dealing with future HPC applications built from one or a combination of them.
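For reference, the abstract does not define the proposed cluster overhead and coupling metrics themselves; the total parallel overhead they are compared against is the standard measure of the extra work a p-processor run performs relative to the best serial run, commonly written as

T_o = p \, T_P - T_S, \qquad E = \frac{T_S}{p \, T_P} = \frac{1}{1 + T_o / T_S},

where T_S is the serial runtime, T_P the parallel runtime on p processors, and E the resulting parallel efficiency.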

2021 ◽  
Author(s):  
Mariza Ferro ◽  
Vinicius P. Klôh ◽  
Matheus Gritz ◽  
Vitor de Sá ◽  
Bruno Schulze

Understanding, through runtime behavior, the computational impact of scientific applications on the underlying architectures should guide the use of computational resources in high-performance computing systems. In this work, we propose an analysis of Machine Learning (ML) algorithms that learn about the performance of these applications from hardware events and derived performance metrics. Nine NAS benchmarks were executed and their hardware events collected. These experimental results were used to train a Neural Network, a Decision Tree Regressor, and a Linear Regression model to predict the runtime of scientific applications from the performance metrics.
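The abstract does not list the exact hardware events or model configurations used; the following minimal sketch only illustrates the general setup, with synthetic placeholder data standing in for collected event counts and measured runtimes, and hypothetical feature dimensions.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Placeholder data: each row stands in for the hardware-event counts of one
# benchmark run (e.g. instructions, cache misses, branch misses, FLOPs) and
# the target is the measured runtime in seconds. These values are synthetic.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = 2.0 * X[:, 0] + 5.0 * X[:, 2] + rng.normal(0.0, 0.1, 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "neural network": MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE = {mae:.3f} s")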


Author(s):  
Nikolay Kondratyuk ◽  
Vsevolod Nikolskiy ◽  
Daniil Pavlov ◽  
Vladimir Stegailov

Classical molecular dynamics (MD) calculations represent a significant part of the utilization time of high-performance computing systems. As usual, the efficiency of such calculations is determined by an interplay of software and hardware, which is nowadays moving to hybrid GPU-based technologies. Several well-developed open-source MD codes targeting GPUs differ both in their data management capabilities and in performance. In this work, we analyze the performance of the LAMMPS, GROMACS, and OpenMM MD packages with different GPU backends on Nvidia Volta and AMD Vega20 GPUs. We consider the efficiency of solving two identical MD models (generic for materials science and for biomolecular studies) using different software and hardware combinations. We describe our experience in porting the CUDA backend of LAMMPS to ROCm HIP, which shows considerable benefits for AMD GPUs compared to the OpenCL backend.
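The benchmark configurations themselves are not given in the abstract; as a minimal illustration of how a GPU backend is chosen in one of the compared packages, the sketch below uses OpenMM's Python API to enumerate the compute platforms available in the local build and pick one (a HIP platform only appears if a corresponding plugin, such as openmm-hip, is installed; treat the platform names as build-dependent).

import openmm as mm

# List the compute platforms this OpenMM build exposes, e.g. 'Reference',
# 'CPU', 'CUDA', 'OpenCL', and possibly 'HIP' with a vendor plugin.
available = [mm.Platform.getPlatform(i).getName()
             for i in range(mm.Platform.getNumPlatforms())]
print("Available platforms:", available)

# Prefer a GPU backend when present, otherwise fall back to the CPU platform.
name = next((p for p in ("CUDA", "HIP", "OpenCL") if p in available), "CPU")
platform = mm.Platform.getPlatformByName(name)
print("Using platform:", platform.getName())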

