Energy-aware job scheduler for high-performance computing

In-silico research has grown considerably. Today's scientific code involves long-running computer simulations, so powerful computing infrastructures are needed. Traditionally, research in high-performance computing has focused on executing code as fast as possible; energy has only recently been recognized as another goal to consider. Yet energy-driven research has mostly focused on the hardware and middleware layers, and few efforts target the application level, where many energy-aware optimizations are possible. We revisit a catalog of Java primitives commonly used in OO scientific programming, or micro-benchmarks, to identify energy-friendly versions of the same primitive. We then apply the micro-benchmarks to classical scientific application kernels and machine learning algorithms for both single-thread and multi-thread implementations on a server. Energy usage reductions at the micro-benchmark level are substantial, while for applications the obtained reductions range from 3.90% to 99.18%.

Energy-Aware High-Performance Computing: Survey of State-of-the-Art Tools, Techniques, and Environments

Scientific Programming ◽

10.1155/2019/8348791 ◽

2019 ◽

Vol 2019 ◽

pp. 1-19 ◽

Cited By ~ 4

Author(s):

Pawel Czarnul ◽

Jerzy Proficz ◽

Adam Krzywaniak

Keyword(s):

High Performance Computing ◽

High Performance ◽

Hybrid Methods ◽

State Of The Art ◽

Control Methods ◽

Energy Aware ◽

Power Capping ◽

Power Limits ◽

Performance Computing

The paper presents the state of the art of energy-aware high-performance computing (HPC), in particular the identification and classification of approaches by system and device types, optimization metrics, and energy/power control methods. System types include single devices, clusters, grids, and clouds, while the device types considered include CPUs, GPUs, multiprocessor, and hybrid systems. Optimization goals include various combinations of metrics such as execution time, energy consumption, and temperature, with consideration of imposed power limits. Control methods include scheduling, DVFS/DFS/DCT, power capping with programmatic APIs such as Intel RAPL and NVIDIA NVML, as well as application optimizations and hybrid methods. We discuss tools and APIs for energy/power management as well as tools and environments for prediction and/or simulation of energy/power consumption in modern HPC systems. Finally, programming examples, i.e., applications and benchmarks used in particular works, are discussed. Based on our review, we identified a set of open areas and important up-to-date problems concerning methods and tools for modern HPC systems allowing energy-aware processing.

Comparative Study of Runtime Systems for Energy-Aware High-Performance Computing

Handbook of Energy-Aware and Green Computing, Volume 2 ◽

10.1201/b11640-9 ◽

2013 ◽

pp. 85-106

Keyword(s):

Comparative Study ◽

High Performance Computing ◽

High Performance ◽

Runtime Systems ◽

Energy Aware ◽

Performance Computing

Editorial for the special issue on Energy-aware high performance computing

Computer Science - Research and Development ◽

10.1007/s00450-015-0297-9 ◽

2015 ◽

Vol 31 (4) ◽

pp. 163-164

Author(s):

Wolfgang E. Nagel ◽

Daniel Molka ◽

Thomas Ludwig ◽

Matthias S. Müller

Keyword(s):

High Performance Computing ◽

High Performance ◽

Special Issue ◽

Energy Aware ◽

Performance Computing

Energy-Aware High Performance Computing: A Taxonomy Study

2011 IEEE 17th International Conference on Parallel and Distributed Systems ◽

10.1109/icpads.2011.59 ◽

2011 ◽

Cited By ~ 14

Author(s):

Chang Cai ◽

Lizhe Wang ◽

Samee U. Khan ◽

Jie Tao

Keyword(s):

High Performance Computing ◽

High Performance ◽

Energy Aware ◽

Performance Computing

Editorial for the second international conference on energy-aware high performance computing

Computer Science - Research and Development ◽

10.1007/s00450-011-0199-4 ◽

2011 ◽

Vol 27 (4) ◽

pp. 225-226

Author(s):

Thomas Ludwig ◽

Timo Minartz

Keyword(s):

High Performance Computing ◽

High Performance ◽

Energy Aware ◽

International Conference ◽

Performance Computing ◽

Second International

Predicting running time of aerodynamic jobs in HPC system by combining supervised and unsupervised learning method

10.21203/rs.3.rs-360961/v1 ◽

2021 ◽

Author(s):

Hao Wang ◽

Yi-Qin Dai ◽

Jie Yu ◽

Yong Dong

Keyword(s):

Unsupervised Learning ◽

High Performance Computing ◽

Prediction Accuracy ◽

High Performance ◽

Computing Systems ◽

Running Time ◽

Supervised And Unsupervised Learning ◽

Underestimation Rate ◽

Job Scheduler ◽

Performance Computing

Improving resource utilization is an important goal of high-performance computing systems at supercomputing centers. To meet this goal, the job scheduler of a high-performance computing system often uses backfilling to fit short jobs into the gaps left by jobs at the front of the queue. Backfilling requires the running time of each job. In the past, job running times were usually supplied by users and often far exceeded the actual running time, which led to inaccurate backfilling and wasted computing resources. In particular, when the predicted job running time is lower than the actual time, the damage to the utilization of the system’s computing resources becomes more serious. Therefore, the prediction accuracy of the job running time is crucial to the utilization of system resources. Machine learning methods can predict the job running time more accurately. Targeting parallel aerodynamics applications, we propose SU, a job running time prediction framework that combines supervised and unsupervised learning, and verify it on real historical data from the high-performance computing systems of the China Aerodynamics Research and Development Center (CARDC). The experimental results show that SU has a high prediction accuracy (80.46%) and a low underestimation rate (24.85%).
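
The backfilling idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the scheduler or the SU framework from the paper; it assumes a simplified EASY-style policy in which only the job at the head of the queue holds a reservation, and the names `Job` and `easy_backfill` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int       # nodes requested
    runtime: float   # estimated running time (user-supplied or predicted)


def easy_backfill(queue, running, free_nodes, now=0.0):
    """Return the names of the jobs to start at time `now`.

    queue   : waiting Job objects in arrival order
    running : (end_time, nodes) pairs for jobs already executing
    """
    started = []
    queue = list(queue)
    running = list(running)
    # Start jobs from the front of the queue while they fit.
    while queue and queue[0].nodes <= free_nodes:
        job = queue.pop(0)
        free_nodes -= job.nodes
        running.append((now + job.runtime, job.nodes))
        started.append(job.name)
    if not queue:
        return started

    # Reserve the earliest start time (the "shadow time") for the blocked head job.
    head = queue[0]
    available = free_nodes
    shadow_time = float("inf")
    extra_nodes = 0  # nodes still free at the shadow time after the head job starts
    for end_time, nodes in sorted(running):
        available += nodes
        if available >= head.nodes:
            shadow_time = end_time
            extra_nodes = available - head.nodes
            break

    # Backfill later jobs only if they cannot delay the head job's reservation.
    for job in queue[1:]:
        if job.nodes > free_nodes:
            continue
        if now + job.runtime <= shadow_time:
            # Estimated to finish before the head job needs the nodes.
            free_nodes -= job.nodes
            started.append(job.name)
        elif job.nodes <= extra_nodes:
            # Runs past the shadow time but only on nodes the head job does not need.
            free_nodes -= job.nodes
            extra_nodes -= job.nodes
            started.append(job.name)
    return started


# 10-node cluster: 6 nodes busy until t=10, 4 nodes free.
waiting = [Job("A", 8, 5.0), Job("B", 4, 20.0), Job("C", 2, 8.0), Job("D", 2, 30.0)]
print(easy_backfill(waiting, [(10.0, 6)], free_nodes=4))  # ['C', 'D']
```

In this example, job A is blocked until t=10, so C (estimated to finish by t=8) and D (small enough to run on spare nodes) are backfilled. If the estimate for C were inflated from 8 to 50, the same call would start only C, which illustrates why the accuracy of running-time estimates affects utilization.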

E-BaTS: Energy-Aware Scheduling for Bag-of-Task Applications in HPC Clusters

Parallel Processing Letters ◽

10.1142/s0129626415410054 ◽

2015 ◽

Vol 25 (03) ◽

pp. 1541005

Author(s):

Alexandra Vintila Filip ◽

Ana-Maria Oprescu ◽

Stefania Costache ◽

Thilo Kielmann

Keyword(s):

Energy Consumption ◽

High Performance Computing ◽

High Performance ◽

Terms Of Trade ◽

Exhaustive Search ◽

Energy Aware ◽

Trade Offs ◽

Energy Aware Scheduling ◽

And Performance ◽

Performance Computing

High-Performance Computing (HPC) systems consume large amounts of energy. As predictions of HPC energy consumption keep rising, it is important to make users aware of the energy spent on executing their applications. Drawing from our experience with exposing cost and performance in public clouds, in this paper we present a generic mechanism to compute fast and accurate estimates of the trade-offs between the performance (expressed as makespan) and the energy consumption of applications running on HPC clusters. We validate our approach by implementing it in a prototype, called E-BaTS, and evaluating it with a wide variety of HPC bags-of-tasks. Our experiments show that E-BaTS produces conservative estimates with errors below 5%, while requiring at most 12% of the energy and time of an exhaustive search for providing configurations close to the optimal ones in terms of trade-offs between energy consumption and makespan.
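
The notion of a trade-off between makespan and energy can be made concrete with a small sketch. This is not the E-BaTS estimation mechanism; it only assumes that estimates of makespan and energy are already available for a few candidate configurations, and it keeps those that are not beaten on both metrics at once. The function name `pareto_front` and the sample numbers are hypothetical.

```python
def pareto_front(configs):
    """Keep configurations that no other configuration beats on both metrics.

    configs: list of (label, makespan_seconds, energy_joules) tuples.
    """
    front = []
    best_energy = float("inf")
    # Sort by makespan (ties broken by energy), then sweep once:
    # a configuration survives only if it uses less energy than
    # every faster configuration seen so far.
    for label, makespan, energy in sorted(configs, key=lambda c: (c[1], c[2])):
        if energy < best_energy:
            front.append((label, makespan, energy))
            best_energy = energy
    return front


# Made-up estimates for one bag of tasks on different cluster configurations.
candidates = [
    ("2 nodes", 190, 950),
    ("4 nodes", 100, 900),
    ("8 nodes", 60, 1000),
    ("16 nodes", 40, 1400),
    ("16 nodes, high frequency", 45, 1500),
]
for label, makespan, energy in pareto_front(candidates):
    print(label, makespan, energy)
```

With these sample values, the 2-node and high-frequency configurations are dropped because another configuration is both faster and cheaper in energy; the remaining three are the options a user would choose among.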
