EAIS: Energy-aware adaptive scheduling for CNN inference on high-performance GPUs

Author(s):  
Chunrong Yao ◽  
Wantao Liu ◽  
Weiqing Tang ◽  
Songlin Hu
2019 ◽  
Vol 16 (2) ◽  
pp. 541-564
Author(s):  
Mathias Longo ◽  
Ana Rodriguez ◽  
Cristian Mateos ◽  
Alejandro Zunino

In-silico research has grown considerably. Today's scientific code involves long-running computer simulations, and hence powerful computing infrastructures are needed. Traditionally, research in high-performance computing has focused on executing code as fast as possible, while energy has only recently been recognized as another goal to consider. Yet, energy-driven research has mostly focused on the hardware and middleware layers, and few efforts target the application level, where many energy-aware optimizations are possible. We revisit a catalog of Java primitives commonly used in object-oriented scientific programming, in the form of micro-benchmarks, to identify energy-friendly versions of each primitive. We then apply the micro-benchmarks to classical scientific application kernels and machine learning algorithms, for both single-thread and multi-thread implementations, on a server. Energy usage reductions at the micro-benchmark level are substantial, while the reductions obtained for applications range from 3.90% to 99.18%.
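
The measurement methodology behind such micro-benchmarks can be sketched briefly: read a hardware energy counter before and after running each functionally equivalent variant, and compare. The paper's primitives are Java, so the Python harness below is only an illustrative equivalent; the Linux powercap (RAPL) sysfs paths and the two toy string-building variants are assumptions, not the authors' benchmarks.

```python
# Illustrative energy micro-benchmark harness using the Linux powercap (RAPL)
# interface. Assumes an Intel CPU exposing /sys/class/powercap/intel-rapl:0
# and permission to read the counters.
import time

RAPL_ENERGY = "/sys/class/powercap/intel-rapl:0/energy_uj"       # package energy, microjoules
RAPL_MAX = "/sys/class/powercap/intel-rapl:0/max_energy_range_uj"

def read_uj(path):
    with open(path) as f:
        return int(f.read())

def measure(fn, *args):
    """Return (elapsed seconds, energy in joules) for one call of fn."""
    wrap = read_uj(RAPL_MAX)
    e0, t0 = read_uj(RAPL_ENERGY), time.time()
    fn(*args)
    e1, t1 = read_uj(RAPL_ENERGY), time.time()
    if e1 < e0:                        # counter wrapped around
        e1 += wrap
    return t1 - t0, (e1 - e0) / 1e6

# Two functionally equivalent "primitives" to compare (toy example, assumption).
def concat_naive(n):
    s = ""
    for i in range(n):
        s += str(i)
    return s

def concat_buffered(n):
    return "".join(str(i) for i in range(n))

if __name__ == "__main__":
    for variant in (concat_naive, concat_buffered):
        t, e = measure(variant, 200_000)
        print(f"{variant.__name__}: {t:.3f} s, {e:.3f} J")
```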


2019 ◽  
Vol 2019 ◽  
pp. 1-19 ◽  
Author(s):  
Pawel Czarnul ◽  
Jerzy Proficz ◽  
Adam Krzywaniak

The paper presents the state of the art of energy-aware high-performance computing (HPC), in particular the identification and classification of approaches by system and device types, optimization metrics, and energy/power control methods. System types include single devices, clusters, grids, and clouds, while the considered device types include CPUs, GPUs, multiprocessors, and hybrid systems. Optimization goals include various combinations of metrics such as execution time, energy consumption, and temperature, with consideration of imposed power limits. Control methods include scheduling, DVFS/DFS/DCT, power capping with programmatic APIs such as Intel RAPL and NVIDIA NVML, as well as application optimizations and hybrid methods. We discuss tools and APIs for energy/power management as well as tools and environments for prediction and/or simulation of energy/power consumption in modern HPC systems. Finally, programming examples, i.e., applications and benchmarks used in particular works, are discussed. Based on our review, we identified a set of open areas and important up-to-date problems concerning methods and tools for energy-aware processing on modern HPC systems.
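
As a concrete illustration of the programmatic power capping the survey mentions, the sketch below queries and lowers a GPU power limit through the NVIDIA NVML Python bindings (the nvidia-ml-py / pynvml package). The 150 W target and device index 0 are illustrative assumptions, and changing a limit normally requires administrative privileges.

```python
# Illustrative GPU power capping via NVIDIA NVML (pip install nvidia-ml-py).
# Querying limits needs no special rights; setting them usually requires root.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)            # GPU 0 (assumption)

name = pynvml.nvmlDeviceGetName(handle)
min_mw, max_mw = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
current_mw = pynvml.nvmlDeviceGetPowerManagementLimit(handle)
print(f"{name}: limit {current_mw / 1000:.0f} W "
      f"(allowed {min_mw / 1000:.0f}-{max_mw / 1000:.0f} W)")

target_mw = 150_000                                      # 150 W cap, illustrative
try:
    pynvml.nvmlDeviceSetPowerManagementLimit(handle, target_mw)
    print(f"power limit set to {target_mw / 1000:.0f} W")
except pynvml.NVMLError as err:
    print(f"could not set limit: {err}")

pynvml.nvmlShutdown()
```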


Author(s):  
Yukihiro Nakagawa ◽  
Takeshi Shimizu ◽  
Takeshi Horie ◽  
Yoichi Koyanagi ◽  
Osamu Shiraki ◽  
...  

The use of virtualization technology has been increasing in the IT industry to consolidate servers and reduce power consumption significantly. Virtualized commodity servers are scaled out in the data center and increase the demand for bandwidth between servers; therefore, a high-performance switch is required. The shared-memory switch is the best performance/cost switch architecture, but it is challenging to satisfy the memory-bandwidth requirements of a high-speed network. In addition, it is challenging to handle variable-length frames in Ethernet. This chapter describes the main challenges in Ethernet switch design and then energy-aware switch designs, including switch architecture and high-speed IO interfaces. As implementation examples, this chapter also describes a single-chip switch Large-Scale Integration (LSI) device embedded with high-speed IO interfaces and a 10-Gigabit Ethernet (10GbE) switch blade equipped with the switch LSI. The switch blade delivers 100% more performance per watt than other 10GbE switch blades in the industry.


2007 ◽  
Vol 1 (5) ◽  
pp. 565 ◽  
Author(s):  
E.P. Ramo ◽  
J. Resano ◽  
D. Mozos ◽  
F. Catthoor

Electronics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 139
Author(s):  
Rosemberg Rodriguez Salas ◽  
Petr Dokladal ◽  
Eva Dokladalova

Convolutional Neural Network (CNN) model size reduction has recently gained interest due to several advantages: lower energy cost, suitability for embedded devices, and multi-core interfaces. One possible way to achieve model reduction is to use rotation-invariant CNNs, which avoid the need for data-augmentation techniques. In this work, we present the next step toward a general solution for endowing CNN architectures with the capability of classifying rotated objects and predicting the rotation angle without data augmentation. The principle consists of concatenating a representation mapping that transforms rotation into translation with a shared-weights predictor. This solution has the advantage of admitting different combinations of basic, existing blocks. We present results obtained using a Gabor-filter bank and a ResNet feature backbone, compared to previous solutions. We also show that one can choose between parallelizing the network across several threads for energy-aware High-Performance Computing (HPC) applications or reducing the memory footprint for embedded systems. We obtain a competitive error rate on classifying rotated MNIST and outperform existing state-of-the-art results on CIFAR-10 when trained on upright examples and validated on random orientations.
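
A minimal PyTorch sketch of the stated principle follows, under clearly labeled assumptions: a hand-built Gabor filter bank maps input rotation to a translation (a cyclic shift) along an orientation axis, and one shared-weights predictor is applied to every orientation, with the per-orientation scores combined by a max. The filter sizes, the number of orientations, and the tiny classifier head are illustrative choices, not the authors' architecture.

```python
# Minimal sketch: rotation -> translation via a Gabor filter bank, followed by
# a shared-weights predictor applied to every orientation channel.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def gabor_bank(n_orient=8, size=7, sigma=2.0, lam=4.0):
    """Return an (n_orient, 1, size, size) bank of oriented Gabor kernels."""
    half = size // 2
    ys, xs = torch.meshgrid(torch.arange(-half, half + 1, dtype=torch.float32),
                            torch.arange(-half, half + 1, dtype=torch.float32),
                            indexing="ij")
    kernels = []
    for k in range(n_orient):
        theta = k * math.pi / n_orient
        xr = xs * math.cos(theta) + ys * math.sin(theta)
        g = torch.exp(-(xs**2 + ys**2) / (2 * sigma**2)) * torch.cos(2 * math.pi * xr / lam)
        kernels.append(g - g.mean())
    return torch.stack(kernels).unsqueeze(1)

class RotationInvariantNet(nn.Module):
    def __init__(self, n_orient=8, n_classes=10):
        super().__init__()
        self.register_buffer("bank", gabor_bank(n_orient))    # fixed orientation mapping
        # One predictor, shared across all orientation channels (toy head, assumption).
        self.shared = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, n_classes),
        )

    def forward(self, x):                                      # x: (B, 1, H, W)
        orient = F.conv2d(x, self.bank, padding=self.bank.shape[-1] // 2)  # (B, O, H, W)
        scores = torch.stack(
            [self.shared(orient[:, k:k + 1]) for k in range(orient.shape[1])], dim=1
        )                                                      # (B, O, n_classes)
        # Rotating the input cyclically shifts the orientation axis, so the max
        # over orientations is approximately rotation invariant.
        return scores.max(dim=1).values

if __name__ == "__main__":
    net = RotationInvariantNet()
    logits = net(torch.randn(2, 1, 28, 28))
    print(logits.shape)                                        # torch.Size([2, 10])
```

Because the same predictor weights are reused for every orientation, the per-orientation passes can either run in parallel threads or be executed sequentially to keep the memory footprint small, mirroring the HPC-versus-embedded trade-off described above.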

