Dynamic Trade-off among Fault Tolerance, Energy Consumption, and Performance on a Multiple-Issue VLIW Processor

2018 ◽  
Vol 4 (3) ◽  
pp. 327-339 ◽  
Author(s):  
Anderson L. Sartor ◽  
Pedro H. E. Becker ◽  
Joost Hoozemans ◽  
Stephan Wong ◽  
Antonio C. S. Beck
Author(s):  
Amir Mahdi Hosseini Monazzah ◽  
Amir M. Rahmani ◽  
Antonio Miele ◽  
Nikil Dutt

Due to the persistent quest for larger on-chip memories and caches in multicore and manycore architectures, Spin Transfer Torque Magnetic RAM (STT-MRAM or STT-RAM) has been proposed as a promising technology to replace classical SRAM in near-future devices. The main advantages of STT-RAM are a considerably higher transistor density and negligible leakage power compared with SRAM technology. However, the drawback of this technology is a high probability of errors, especially in write operations. Such errors are asymmetric and transition-dependent, where 0 → 1 is the most critical transition, and its rate depends strongly on the amount of current (voltage) supplied to the memory during the write operation. As a consequence, STT-RAMs present an intrinsic trade-off between energy consumption and reliability that needs to be properly tuned with respect to the currently running application and its reliability requirement. This chapter proposes FlexRel, an energy-aware reliability improvement architectural scheme for STT-RAM cache memories. FlexRel considers a memory architecture provided with Error Correction Codes (ECCs) and a custom current regulator for the various cache ways, and trades off reliability against energy consumption. The FlexRel cache controller dynamically profiles the number of 0 → 1 transitions of each bit-write operation in a cache block and, based on that, selects the most suitable cache way and current level to guarantee the necessary error rate threshold (in terms of occurred write errors) while minimizing the energy consumption. We experimentally evaluated the efficiency of FlexRel against the most efficient uniform protection scheme from the reliability, energy, area, and performance perspectives. Experimental simulations performed using gem5 have demonstrated that, while satisfying the given error rate threshold, FlexRel delivers up to 13.2% energy savings. From the area footprint perspective, FlexRel delivers up to 7.9% cache way area savings. Furthermore, the performance overhead of the FlexRel algorithm, which changes the traffic patterns of the cache ways during execution, is 1.7% on average.
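
A minimal sketch of the selection idea described in the abstract, assuming a table of current levels with per-transition error rates; all names, thresholds, and current levels here are hypothetical illustrations, not the chapter's actual controller:

```python
# Hypothetical sketch of FlexRel-style current selection: count 0 -> 1
# transitions in a block write, then pick the cheapest current level whose
# expected error rate stays under the threshold. Values are made up.

def count_0_to_1_transitions(old_block: int, new_block: int) -> int:
    """Count bits that flip from 0 to 1 when writing new_block over old_block."""
    return bin(~old_block & new_block).count("1")

# Hypothetical (normalized) current levels: higher current means fewer
# write errors per 0 -> 1 transition but more energy per written bit.
CURRENT_LEVELS = [
    {"current": 1.0, "error_per_transition": 1e-4, "energy_per_bit": 1.0},
    {"current": 1.3, "error_per_transition": 1e-6, "energy_per_bit": 1.6},
    {"current": 1.6, "error_per_transition": 1e-8, "energy_per_bit": 2.4},
]

def select_current_level(old_block, new_block, error_threshold):
    """Pick the cheapest level whose expected write-error rate for this
    block write stays within the threshold."""
    n = count_0_to_1_transitions(old_block, new_block)
    for level in CURRENT_LEVELS:  # ordered cheapest to most protective
        if n * level["error_per_transition"] <= error_threshold:
            return level
    return CURRENT_LEVELS[-1]  # fall back to the strongest protection

# A write with many 0 -> 1 flips is steered to a higher current level.
level = select_current_level(0b0000_0000, 0b1111_0110, error_threshold=1e-5)
print(level["current"])  # 1.3
```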


2017 ◽  
Vol 2 (2) ◽  
pp. 113-126 ◽  
Author(s):  
Md Sabbir Hasan ◽  
Frederico Alvares ◽  
Thomas Ledoux ◽  
Jean-Louis Pazat

Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 229
Author(s):  
Xianzhong Tian ◽  
Juan Zhu ◽  
Ting Xu ◽  
Yanjun Li

The latest results in Deep Neural Networks (DNNs) have greatly improved the accuracy and performance of a variety of intelligent applications. However, running such computation-intensive DNN-based applications on resource-constrained mobile devices leads to long latency and high energy consumption. The traditional approach is to run DNNs in the central cloud, but this requires significant amounts of data to be transferred over the wireless network and also results in long latency. To solve this problem, offloading part of the DNN computation to edge clouds has been proposed, realizing collaborative execution between mobile devices and edge clouds. In addition, the mobility of mobile devices can easily cause computation offloading to fail. In this paper, we develop a mobility-included DNN partition offloading algorithm (MDPO) that adapts to user mobility. The objective of MDPO is to minimize the total latency of completing a DNN job while the mobile user is moving. The MDPO algorithm is suitable for DNNs with both chain and graph topologies. We evaluate the performance of MDPO against local-only and edge-only execution; experiments show that MDPO significantly reduces total latency, improves DNN performance, and adjusts well to different network conditions.
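
A minimal sketch of choosing a partition point for a chain-topology DNN in the spirit of MDPO's latency objective; the layer profiles, bandwidth model, and mobility (handover) penalty below are hypothetical placeholders, not the paper's algorithm or measurements:

```python
# Hypothetical per-layer profiles: (device_time_ms, edge_time_ms, output_kb).
LAYERS = [
    (12.0, 2.0, 600.0),   # conv1
    (30.0, 5.0, 300.0),   # conv2
    (25.0, 4.0, 80.0),    # conv3
    (10.0, 1.5, 4.0),     # fc
]

RAW_INPUT_KB = 500.0  # size of the raw input if everything is offloaded

def total_latency(split, bandwidth_kb_per_ms, handover_penalty_ms=0.0):
    """Latency of running layers [0, split) locally and the rest on the
    edge, including the transfer of the intermediate tensor. A mobility-
    induced handover adds a fixed penalty to any transfer."""
    device = sum(t_dev for t_dev, _, _ in LAYERS[:split])
    edge = sum(t_edge for _, t_edge, _ in LAYERS[split:])
    if split == len(LAYERS):          # everything local: nothing to send
        transfer = 0.0
    else:
        size = LAYERS[split - 1][2] if split > 0 else RAW_INPUT_KB
        transfer = size / bandwidth_kb_per_ms + handover_penalty_ms
    return device + transfer + edge

def best_partition(bandwidth_kb_per_ms, handover_penalty_ms):
    """Exhaustively pick the split point with the lowest total latency."""
    return min(range(len(LAYERS) + 1),
               key=lambda s: total_latency(s, bandwidth_kb_per_ms,
                                           handover_penalty_ms))

# As the expected handover penalty grows with user movement, the optimal
# split shifts toward sending less data, here all the way to local-only.
print(best_partition(bandwidth_kb_per_ms=50.0, handover_penalty_ms=0.0))   # 0
print(best_partition(bandwidth_kb_per_ms=50.0, handover_penalty_ms=60.0))  # 4
```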


2021 ◽  
Vol 18 (2) ◽  
pp. 1-24
Author(s):  
Nhut-Minh Ho ◽  
Himeshi De Silva ◽  
Weng-Fai Wong

This article presents GRAM (GPU-based Runtime Adaption for Mixed-precision), a framework for the effective use of mixed-precision arithmetic in CUDA programs. Our method provides a fine-grained trade-off between output error and performance. It can create many variants that satisfy different accuracy requirements by adaptively assigning different groups of threads to different precision levels at runtime. To widen the range of applications that can benefit from its approximation, GRAM comes with an optional half-precision approximate math library. Using GRAM, precision can be traded off for performance improvements of up to 540%, depending on the application and the accuracy requirement.
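
A minimal sketch of the runtime idea the abstract describes: start from an aggressive precision assignment and promote groups of work to higher precision until the output error meets the requirement. The kernel, error metric, and grouping are hypothetical stand-ins (NumPy on the CPU rather than CUDA thread groups):

```python
import numpy as np

def run_groups(x, group_precisions):
    """Apply a toy kernel to each group of elements at its assigned precision."""
    out = np.empty_like(x, dtype=np.float64)
    groups = np.array_split(np.arange(x.size), len(group_precisions))
    for idx, dtype in zip(groups, group_precisions):
        v = x[idx].astype(dtype)
        out[idx] = (np.sqrt(v) * v).astype(np.float64)  # stand-in computation
    return out

def adapt_precision(x, error_budget, n_groups=4):
    """Promote groups from half to single precision, one at a time, until
    the relative error against a double-precision run is within budget."""
    reference = run_groups(x, [np.float64] * n_groups)
    precisions = [np.float16] * n_groups
    for g in range(n_groups):
        approx = run_groups(x, precisions)
        rel_err = np.max(np.abs(approx - reference) / np.abs(reference))
        if rel_err <= error_budget:
            break
        precisions[g] = np.float32  # promote one group and re-check
    return precisions

x = np.linspace(1.0, 100.0, 1024)
print(adapt_precision(x, error_budget=1e-4))
```

Tighter error budgets leave more groups at half precision only when the data tolerates it, which mirrors the accuracy-for-performance dial the article describes.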


Energies ◽  
2021 ◽  
Vol 14 (14) ◽  
pp. 4089
Author(s):  
Kaiqiang Zhang ◽  
Dongyang Ou ◽  
Congfeng Jiang ◽  
Yeliang Qiu ◽  
Longchuan Yan

In terms of power and energy consumption, DRAM plays a key role in modern server systems, alongside processors. Although power-aware scheduling is based on the proportion of energy consumed by DRAM relative to other components, when running memory-intensive applications the energy consumption of the whole server system is significantly affected by the non-energy-proportionality of DRAM. Furthermore, modern servers usually use the NUMA architecture in place of the original SMP architecture to increase memory bandwidth, so it is of great significance to study the energy efficiency of these two memory architectures. Therefore, in order to explore the power consumption characteristics of servers under memory-intensive workloads, this paper evaluates the power consumption and performance of memory-intensive applications on different generations of real rack servers. Through this analysis, we find that: (1) workload intensity and the number of concurrently executing threads affect server power consumption, but a fully utilized memory system does not necessarily yield good energy efficiency; (2) even if the memory system is not fully utilized, the memory capacity per processor core has a significant impact on application performance and server power consumption; (3) when running memory-intensive applications, memory utilization is not always a good indicator of server power consumption; and (4) reasonable use of the NUMA architecture improves memory energy efficiency significantly. The experimental results show that reasonable use of the NUMA architecture can improve memory energy efficiency by 16% compared with the SMP architecture, while unreasonable use reduces it by 13%. The findings presented in this paper provide useful insights and guidance for system designers and data center operators in energy-efficiency-aware job scheduling and energy conservation.
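
A minimal sketch of the efficiency comparison above, treating memory energy efficiency as useful work (bytes moved) per joule. The bandwidth and power figures are made-up placeholders chosen only to roughly echo the reported 16% gain and 13% loss, not the paper's measurements:

```python
def memory_efficiency(bytes_moved, watts, seconds):
    """Bytes per joule: higher is better."""
    return bytes_moved / (watts * seconds)

GB = 1e9

# Hypothetical runs of the same memory-intensive job:
smp         = memory_efficiency(400 * GB, watts=120, seconds=100)
numa_local  = memory_efficiency(400 * GB, watts=115, seconds=90)   # threads pinned near their memory
numa_remote = memory_efficiency(400 * GB, watts=125, seconds=110)  # heavy cross-socket traffic

print(f"NUMA local  vs SMP: {numa_local / smp - 1:+.1%}")   # gain from good placement
print(f"NUMA remote vs SMP: {numa_remote / smp - 1:+.1%}")  # loss from bad placement
```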


2021 ◽  
Vol 34 (5) ◽  
pp. 303-318
Author(s):  
Maarten Baele ◽  
An Vermeulen ◽  
Dimitri Adons ◽  
Roos Peeters ◽  
Angelique Vandemoortele ◽  
...  

Author(s):  
Harold O. Fried ◽  
Loren W. Tauer

This article explores how well an individual manages his or her own talent to achieve high performance in an individual sport. Its setting is the Ladies Professional Golf Association (LPGA). The order-m approach is explained, and the data and empirical findings are presented. The inputs measure fundamental golfing athletic ability; the output measures success on the LPGA tour. The correlation coefficient between earnings per event and the ability to perform under pressure is 0.48. Golfers' careers occur on the front end of the age distribution, and there is a classic trade-off between the inevitable deterioration in the mental ability to handle pressure and the experience gained with time. The ability to perform under pressure peaks at age 37.

