Aggressive and reliable high-performance architectures - techniques for thermal control, energy efficiency, and performance augmentation

2011 ◽  
Author(s):  
Prem Kumar Ramesh


Author(s):  
Miguel Bordallo López

Computer vision can be used to increase the interactivity of existing and new camera-based applications and to build novel interaction methods and user interfaces. The computing and sensing needs of these applications require a careful, practical trade-off between quality and performance. This chapter shows the importance of using all the available resources to hide application latency and maximize computational throughput. The experience gained while developing interactive applications is used to characterize the constraints imposed by the mobile environment and to discuss the most important design goals: high performance and low power consumption. In addition, the chapter discusses the use of heterogeneous computing via asymmetric multiprocessing to improve the throughput and energy efficiency of interactive vision-based applications.
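As a rough illustration of the latency-hiding idea, the sketch below overlaps frame capture and processing on separate threads so neither stage waits for the other; the function names and timings are placeholders, not APIs or figures from the chapter.

```python
import queue
import threading
import time

# Minimal producer/consumer sketch of latency hiding: capture and processing
# run concurrently. capture_frame / process_frame are hypothetical stand-ins.

def capture_frame():
    time.sleep(0.03)          # stand-in for ~30 ms camera exposure/readout
    return object()           # stand-in for an image buffer

def process_frame(frame):
    time.sleep(0.02)          # stand-in for vision processing on another core

frames = queue.Queue(maxsize=2)  # small buffer keeps end-to-end latency bounded

def producer(n):
    for _ in range(n):
        frames.put(capture_frame())

def consumer(n):
    for _ in range(n):
        process_frame(frames.get())

t0 = time.time()
p = threading.Thread(target=producer, args=(30,))
c = threading.Thread(target=consumer, args=(30,))
p.start(); c.start(); p.join(); c.join()
print(f"30 frames in {time.time() - t0:.2f} s with capture and processing overlapped")
```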


2014 ◽  
Vol 22 (4) ◽  
pp. 273-283 ◽  
Author(s):  
Robert Schöne ◽  
Jan Treibig ◽  
Manuel F. Dolz ◽  
Carla Guillen ◽  
Carmen Navarrete ◽  
...  

Energy costs now represent a significant share of the total cost of ownership of High Performance Computing (HPC) systems. In this paper we provide an overview of different aspects of energy efficiency measurement and optimization. This includes metrics that define energy efficiency and a description of common power and energy measurement tools. We discuss performance measurement and analysis suites that use these tools and allow users to analyze energy efficiency weaknesses in their code. We also demonstrate how the obtained power and performance data can be used to locate inefficient resource usage or to create a model that predicts optimal operation points. We further present interfaces in these suites that allow automated tuning for energy efficiency and describe how these interfaces are used. We finally discuss how a hard power limit will change our view of energy-efficient HPC in the future.
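To make the metrics concrete, the following sketch integrates sampled node power into energy-to-solution and derives the energy-delay product and FLOPS per watt; the samples are synthetic stand-ins for readings that would come from the kinds of measurement tools surveyed in the paper.

```python
import numpy as np

# Synthetic power trace: 120 s run sampled at 10 Hz (placeholder values).
timestamps = np.linspace(0.0, 120.0, 1201)        # s
power = 180.0 + 20.0 * np.random.rand(1201)       # W, node power samples
flops_executed = 3.2e13                           # total FLOPs of the run (assumed)

# Trapezoidal integration of power over time gives energy to solution.
energy = float(np.sum(0.5 * (power[1:] + power[:-1]) * np.diff(timestamps)))  # J
runtime = timestamps[-1] - timestamps[0]          # s
edp = energy * runtime                            # J*s, energy-delay product
flops_per_watt = flops_executed / energy          # sustained FLOPS/W

print(f"energy-to-solution: {energy / 1e3:.1f} kJ")
print(f"energy-delay product: {edp / 1e3:.1f} kJ*s")
print(f"efficiency: {flops_per_watt / 1e9:.2f} GFLOPS/W")
```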


2019 ◽  
Author(s):  
Vinícius Klôh ◽  
Matheus Gritz ◽  
Bruno Schulze ◽  
Mariza Ferro

Performance and energy efficiency are now critical concerns in high performance scientific computing. The requirements of the scientific problem should guide the orchestration of different energy-saving techniques in order to improve the balance between energy consumption and application performance. To enable this balance, we propose the development of an autonomous framework that performs this orchestration and present the ongoing research towards this development, focusing on the characterization of scientific applications and on performance modeling using Machine Learning.
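As a minimal illustration of the performance-modeling task, the sketch below fits a regression model that predicts runtime from a few application and hardware features; the feature set, the random-forest choice, and the synthetic data are assumptions for illustration, not the framework's published design.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical features: problem size, thread count, CPU frequency (GHz).
X = rng.uniform([1e4, 1, 1.2], [1e6, 32, 3.0], size=(200, 3))
# Synthetic runtime: work divided by parallel throughput, plus noise.
y = X[:, 0] / (X[:, 1] * X[:, 2]) * (1 + 0.1 * rng.standard_normal(200))

model = RandomForestRegressor(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"cross-validated R^2 of the runtime model: {scores.mean():.2f}")
```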


Author(s):  
D. E. Newbury ◽  
R. D. Leapman

Trace constituents, which can be very loosely defined as those present at concentration levels below 1 percent, often exert influence on structure, properties, and performance far greater than what might be estimated from their proportion alone. Defining the role of trace constituents in the microstructure, or indeed even determining their location, makes great demands on the available array of microanalytical tools. These demands become increasingly challenging as the dimensions of the volume element to be probed become smaller. For example, a cubic volume element of silicon with an edge dimension of 1 micrometer contains approximately 5×10¹⁰ atoms. High performance secondary ion mass spectrometry (SIMS) can be used to measure trace constituents to levels of hundreds of parts per billion from such a volume element (e.g., detection of at least 100 atoms to give 10% reproducibility with an overall detection efficiency of 1%, considering ionization, transmission, and counting).
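A back-of-envelope check of the quoted figures: 100 detected atoms at 1% overall efficiency imply about 10⁴ atoms present in the voxel, i.e. roughly 200 parts per billion of the ~5×10¹⁰ atoms, with 10% counting precision from Poisson statistics.

```python
# Reproduce the SIMS detection-limit arithmetic quoted in the abstract.
atoms_in_voxel = 5e10            # Si atoms in a 1 um cube (approx.)
detected_counts = 100            # counts needed for ~10% reproducibility
efficiency = 0.01                # ionization x transmission x counting

atoms_required = detected_counts / efficiency            # ~1e4 atoms must be present
concentration = atoms_required / atoms_in_voxel          # atomic fraction
relative_std = detected_counts ** 0.5 / detected_counts  # Poisson counting statistics

print(f"required concentration: {concentration * 1e9:.0f} ppb (atomic)")  # ~200 ppb
print(f"counting precision: {relative_std * 100:.0f}%")                   # 10%
```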


2020 ◽  
Vol 12 (2) ◽  
pp. 19-50 ◽  
Author(s):  
Muhammad Siddique ◽  
Shandana Shoaib ◽  
Zahoor Jan

A key aspect of work processes in service sector firms is the interconnection between tasks and performance. Relational coordination can play an important role in coordinating organizational activities, given the high level of interdependence and complexity in service sector firms. Research has largely supported the view that well-devised high performance work systems (HPWS) can intensify organizational performance. There is a growing debate, however, about the mechanism linking HPWS and performance outcomes. Using relational coordination theory, this study examines a model of the effects of subsets of HPWS, such as motivation-, skill-, and opportunity-enhancing HR practices, on relational coordination among employees working in reciprocally interdependent job settings. Data were gathered from multiple sources, including managers and employees at the individual, functional, and unit levels, to capture their understanding of HPWS and relational coordination (RC) in 218 bank branches in Pakistan. Results of data analysis via structural equation modelling suggest that HPWS predicted RC among officers at the unit level. The findings of the study contribute to both theory and practice.


Author(s):  
Mark Endrei ◽  
Chao Jin ◽  
Minh Ngoc Dinh ◽  
David Abramson ◽  
Heidi Poxon ◽  
...  

Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload and their effect on performance and energy efficiency are typically difficult for application users to assess and to control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that only require a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), the Large-scale Atomic/Molecular Massively Parallel Simulator, and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. We demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
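The sketch below illustrates the flavor of the trade-off analysis: evaluate predicted energy and runtime over a grid of user-controllable settings and keep only the Pareto-optimal points. The closed-form predictors and the thread-count/frequency grid are illustrative assumptions; the paper instead trains statistical and machine learning models from a small number of measured runs.

```python
import numpy as np
from itertools import product

def pareto_front(points):
    """Return indices of points not dominated in (energy, runtime)."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and tuple(q) != tuple(p)
                        for q in points)
        if not dominated:
            keep.append(i)
    return keep

# Candidate settings: thread count x CPU frequency (GHz) -- hypothetical knobs.
settings = np.array(list(product([8, 16, 32, 64], [1.6, 2.0, 2.4, 2.8])))

# Stand-ins for trained predictors: runtime falls, power rises with both knobs.
runtime = 500.0 / (settings[:, 0] ** 0.7 * settings[:, 1])   # s
power = 40.0 + 2.5 * settings[:, 0] + 30.0 * settings[:, 1]  # W
energy = runtime * power                                      # J

points = np.column_stack([energy, runtime])
for i in pareto_front(points):
    threads, freq = settings[i]
    print(f"{int(threads):2d} threads @ {freq:.1f} GHz -> "
          f"{energy[i]:7.0f} J, {runtime[i]:6.1f} s")
```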


2019 ◽  
Vol 14 ◽  
pp. 155892501989525
Author(s):  
Yu Yang ◽  
Yanyan Jia

Ultrafine crystallization of commercially pure titanium yields higher tensile strength, corrosion resistance, and thermal stability, and the material is therefore widely used in medical instrumentation, aerospace, and passenger vehicle manufacturing. However, batch preparation of ultrafine-grained tubular commercially pure titanium is limited by the development of the spinning process and has remained at the theoretical research stage. In this article, tubular TA2 commercially pure titanium was taken as the research object, and an ultrafine-grain forming process based on “5-pass strong spinning, heat treatment, 3-pass spreading, heat treatment” was proposed. Based on the spinning process tests, the ultimate thinning rate of the method was explored and the evolution of the surface microstructure was analyzed with a metallographic microscope. The research suggests that multi-pass spinning with small-to-medium thinning amounts causes the grain structure to be elongated in the axial and tangential directions and then refined, and that axial fiber uniformity is improved. The results have scientific significance for reducing the consumption of high-performance metals and improving material utilization and performance, and they also promote the development of preparation technology for ultrafine-grained metals.
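For readers unfamiliar with the thinning-rate terminology, the small sketch below computes per-pass and cumulative wall-thickness reduction for a hypothetical spinning schedule; the thickness values are invented for illustration and are not data from the article.

```python
# Thinning rate = (thickness before - thickness after) / thickness before.
def thinning_rate(t_initial, t_final):
    return (t_initial - t_final) / t_initial

# Hypothetical wall thickness (mm) after each spinning pass.
passes = [3.0, 2.4, 1.9, 1.5, 1.2, 1.0]
for before, after in zip(passes, passes[1:]):
    print(f"{before:.1f} mm -> {after:.1f} mm : "
          f"{thinning_rate(before, after) * 100:.0f}% per pass")
print(f"cumulative thinning rate: {thinning_rate(passes[0], passes[-1]) * 100:.0f}%")
```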


Energies ◽  
2021 ◽  
Vol 14 (14) ◽  
pp. 4089
Author(s):  
Kaiqiang Zhang ◽  
Dongyang Ou ◽  
Congfeng Jiang ◽  
Yeliang Qiu ◽  
Longchuan Yan

In terms of power and energy consumption, DRAM plays a key role in a modern server system, alongside the processors. Although power-aware scheduling is based on the proportion of energy consumed by DRAM relative to other components, when memory-intensive applications are running, the energy consumption of the whole server system is significantly affected by the energy non-proportionality of DRAM. Furthermore, modern servers usually use a NUMA architecture instead of the original SMP architecture to increase memory bandwidth, so it is of great significance to study the energy efficiency of these two memory architectures. Therefore, in order to explore the power consumption characteristics of servers under memory-intensive workloads, this paper evaluates the power consumption and performance of memory-intensive applications on different generations of real rack servers. Through analysis, we find that: (1) workload intensity and the number of concurrently executing threads affect server power consumption, but a fully utilized memory system does not necessarily yield good energy efficiency; (2) even if the memory system is not fully utilized, the memory capacity per processor core has a significant impact on application performance and server power consumption; (3) when running memory-intensive applications, memory utilization is not always a good indicator of server power consumption; and (4) reasonable use of the NUMA architecture improves memory energy efficiency significantly. The experimental results show that reasonable use of the NUMA architecture can improve memory energy efficiency by 16% compared with the SMP architecture, while unreasonable use of the NUMA architecture reduces memory energy efficiency by 13%. The findings presented in this paper provide useful insights and guidance for system designers and data center operators to help them with energy-efficiency-aware job scheduling and energy conservation.
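The comparison behind finding (4) can be reproduced in spirit with the sketch below: run the same memory-intensive workload with local versus remote NUMA placement and compare memory energy efficiency as bandwidth per watt. The numactl binding flags are the standard Linux ones; the workload binary and the bandwidth/power readings are placeholders for whatever benchmark and meter a given test rig provides.

```python
import subprocess

def run_bound(cpu_node, mem_node, workload):
    # Pin execution to cpu_node and memory allocation to mem_node (Linux numactl).
    cmd = ["numactl", f"--cpunodebind={cpu_node}", f"--membind={mem_node}"] + workload
    subprocess.run(cmd, check=True)

def efficiency(bandwidth_gbs, power_w):
    # Memory energy efficiency as sustained bandwidth per watt.
    return bandwidth_gbs / power_w

# Local placement (node 0 -> node 0) vs. remote placement (node 0 -> node 1).
# run_bound(0, 0, ["./stream_benchmark"])   # hypothetical workload binary
# run_bound(0, 1, ["./stream_benchmark"])

local = efficiency(bandwidth_gbs=95.0, power_w=42.0)    # illustrative readings
remote = efficiency(bandwidth_gbs=60.0, power_w=40.0)
print(f"local: {local:.2f} GB/s/W, remote: {remote:.2f} GB/s/W "
      f"({(local - remote) / remote * 100:.0f}% difference)")
```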


Author(s):  
Kersten Schuster ◽  
Philip Trettner ◽  
Leif Kobbelt

We present a numerical optimization method to find highly efficient (sparse) approximations for convolutional image filters. Using a modified parallel tempering approach, we solve a constrained optimization that maximizes approximation quality while strictly staying within a user-prescribed performance budget. The results are multi-pass filters where each pass computes a weighted sum of bilinearly interpolated sparse image samples, exploiting hardware acceleration on the GPU. We systematically decompose the target filter into a series of sparse convolutions, trying to find good trade-offs between approximation quality and performance. Since our sparse filters are linear and translation-invariant, they do not exhibit the aliasing and temporal coherence issues that often appear in filters working on image pyramids. We show several applications, ranging from simple Gaussian or box blurs to the emulation of sophisticated Bokeh effects with user-provided masks. Our filters achieve high performance as well as high quality, often providing significant speed-up at acceptable quality even for separable filters. The optimized filters can be baked into shaders and used as a drop-in replacement for filtering tasks in image processing or rendering pipelines.
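A plain CPU reference of one such pass is sketched below: each output pixel is a weighted sum of bilinearly interpolated samples taken at fractional offsets. The tap positions and weights are made up for illustration; in the paper they are produced by the optimization and the passes run as GPU shaders.

```python
import numpy as np

def bilinear_sample(img, y, x):
    # Clamp-to-edge bilinear interpolation at a fractional position (y, x).
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    fy, fx = y - y0, x - x0
    y0, y1 = np.clip([y0, y0 + 1], 0, h - 1)
    x0, x1 = np.clip([x0, x0 + 1], 0, w - 1)
    return ((1 - fy) * (1 - fx) * img[y0, x0] + (1 - fy) * fx * img[y0, x1]
            + fy * (1 - fx) * img[y1, x0] + fy * fx * img[y1, x1])

def sparse_pass(img, taps):
    # One pass: weighted sum of sparse, bilinearly interpolated samples.
    out = np.zeros_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = sum(wgt * bilinear_sample(img, y + dy, x + dx)
                            for dy, dx, wgt in taps)
    return out

# Four taps roughly emulating a small blur (illustrative, not optimized).
taps = [(-0.5, -0.5, 0.25), (-0.5, 0.5, 0.25), (0.5, -0.5, 0.25), (0.5, 0.5, 0.25)]
blurred = sparse_pass(np.random.rand(32, 32).astype(np.float32), taps)
print(blurred.shape)
```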

