Dynamic Trade-off among Fault Tolerance, Energy Consumption, and Performance on a Multiple-Issue VLIW Processor

2018 ◽  
Vol 4 (3) ◽  
pp. 327-339 ◽  
Author(s):  
Anderson L. Sartor ◽  
Pedro H. E. Becker ◽  
Joost Hoozemans ◽  
Stephan Wong ◽  
Antonio C. S. Beck
Author(s):  
Amir Mahdi Hosseini Monazzah ◽  
Amir M. Rahmani ◽  
Antonio Miele ◽  
Nikil Dutt

Due to the persistent quest for larger on-chip memories and caches in multicore and manycore architectures, Spin Transfer Torque Magnetic RAM (STT-MRAM or STT-RAM) has been proposed as a promising technology to replace classical SRAM in near-future devices. The main advantages of STT-RAM are a considerably higher transistor density and negligible leakage power compared with SRAM technology. However, the drawback of this technology is a high probability of errors, especially in write operations. Such errors are asymmetric and transition-dependent, where 0 → 1 is the most critical transition, and its rate depends strongly on the amount of current (voltage) supplied to the memory during the write operation. As a consequence, STT-RAMs present an intrinsic trade-off between energy consumption and reliability that needs to be properly tuned with respect to the currently running application and its reliability requirement. This chapter proposes FlexRel, an energy-aware reliability improvement architectural scheme for STT-RAM cache memories. FlexRel considers a memory architecture provided with Error Correction Codes (ECCs) and a custom current regulator for the various cache ways, and trades off reliability against energy consumption. The FlexRel cache controller dynamically profiles the number of 0 → 1 transitions of each bit-write operation in a cache block and, based on that, selects the most suitable cache way and current level to guarantee the necessary error rate threshold (in terms of occurred write errors) while minimizing the energy consumption. We experimentally evaluated the efficiency of FlexRel against the most efficient uniform protection scheme from the reliability, energy, area, and performance perspectives. Experimental simulations performed using gem5 have demonstrated that, while satisfying the given error rate threshold, FlexRel delivers up to 13.2% energy savings. From the area footprint perspective, FlexRel delivers up to 7.9% cache way area savings. Furthermore, the performance overhead of the FlexRel algorithm, which changes the traffic patterns of the cache ways during execution, is 1.7% on average.
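
A minimal sketch of the selection idea described in the abstract, assuming a table of current levels with per-transition error rates; all names, thresholds, and current levels here are hypothetical illustrations, not the chapter's actual controller:

```python
# Hypothetical sketch of FlexRel-style current selection: count 0 -> 1
# transitions in a block write, then pick the cheapest current level whose
# expected error rate stays under the threshold. Values are made up.

def count_0_to_1_transitions(old_block: int, new_block: int) -> int:
    """Count bits that flip from 0 to 1 when writing new_block over old_block."""
    return bin(~old_block & new_block).count("1")

# Hypothetical (normalized) current levels: higher current means fewer
# write errors per 0 -> 1 transition but more energy per written bit.
CURRENT_LEVELS = [
    {"current": 1.0, "error_per_transition": 1e-4, "energy_per_bit": 1.0},
    {"current": 1.3, "error_per_transition": 1e-6, "energy_per_bit": 1.6},
    {"current": 1.6, "error_per_transition": 1e-8, "energy_per_bit": 2.4},
]

def select_current_level(old_block, new_block, error_threshold):
    """Pick the cheapest level whose expected write-error rate for this
    block write stays within the threshold."""
    n = count_0_to_1_transitions(old_block, new_block)
    for level in CURRENT_LEVELS:  # ordered cheapest to most protective
        if n * level["error_per_transition"] <= error_threshold:
            return level
    return CURRENT_LEVELS[-1]  # fall back to the strongest protection

# A write with many 0 -> 1 flips is steered to a higher current level.
level = select_current_level(0b0000_0000, 0b1111_0110, error_threshold=1e-5)
print(level["current"])  # 1.3
```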


2017 ◽  
Vol 2 (2) ◽  
pp. 113-126 ◽  
Author(s):  
Md Sabbir Hasan ◽  
Frederico Alvares ◽  
Thomas Ledoux ◽  
Jean-Louis Pazat

Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 229
Author(s):  
Xianzhong Tian ◽  
Juan Zhu ◽  
Ting Xu ◽  
Yanjun Li

The latest results in Deep Neural Networks (DNNs) have greatly improved the accuracy and performance of a variety of intelligent applications. However, running such computation-intensive DNN-based applications on resource-constrained mobile devices leads to long latency and high energy consumption. The traditional approach is to run DNNs in the central cloud, but this requires significant amounts of data to be transferred over the wireless network and also results in long latency. To solve this problem, offloading part of the DNN computation to edge clouds has been proposed, realizing collaborative execution between mobile devices and edge clouds. In addition, the mobility of mobile devices can easily cause computation offloading to fail. In this paper, we develop a mobility-included DNN partition offloading algorithm (MDPO) that adapts to user mobility. The objective of MDPO is to minimize the total latency of completing a DNN job while the mobile user is moving. The MDPO algorithm is suitable for DNNs with both chain and graph topologies. We evaluate the performance of MDPO against local-only and edge-only execution; experiments show that MDPO significantly reduces total latency, improves DNN performance, and adjusts well to different network conditions.
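
A minimal sketch of choosing a partition point for a chain-topology DNN in the spirit of MDPO's latency objective; the layer profiles, bandwidth model, and mobility (handover) penalty below are hypothetical placeholders, not the paper's algorithm or measurements:

```python
# Hypothetical per-layer profiles: (device_time_ms, edge_time_ms, output_kb).
LAYERS = [
    (12.0, 2.0, 600.0),   # conv1
    (30.0, 5.0, 300.0),   # conv2
    (25.0, 4.0, 80.0),    # conv3
    (10.0, 1.5, 4.0),     # fc
]

RAW_INPUT_KB = 500.0  # size of the raw input if everything is offloaded

def total_latency(split, bandwidth_kb_per_ms, handover_penalty_ms=0.0):
    """Latency of running layers [0, split) locally and the rest on the
    edge, including the transfer of the intermediate tensor. A mobility-
    induced handover adds a fixed penalty to any transfer."""
    device = sum(t_dev for t_dev, _, _ in LAYERS[:split])
    edge = sum(t_edge for _, t_edge, _ in LAYERS[split:])
    if split == len(LAYERS):          # everything local: nothing to send
        transfer = 0.0
    else:
        size = LAYERS[split - 1][2] if split > 0 else RAW_INPUT_KB
        transfer = size / bandwidth_kb_per_ms + handover_penalty_ms
    return device + transfer + edge

def best_partition(bandwidth_kb_per_ms, handover_penalty_ms):
    """Exhaustively pick the split point with the lowest total latency."""
    return min(range(len(LAYERS) + 1),
               key=lambda s: total_latency(s, bandwidth_kb_per_ms,
                                           handover_penalty_ms))

# As the expected handover penalty grows with user movement, the optimal
# split shifts toward sending less data, here all the way to local-only.
print(best_partition(bandwidth_kb_per_ms=50.0, handover_penalty_ms=0.0))   # 0
print(best_partition(bandwidth_kb_per_ms=50.0, handover_penalty_ms=60.0))  # 4
```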


2021 ◽  
Vol 18 (2) ◽  
pp. 1-24
Author(s):  
Nhut-Minh Ho ◽  
Himeshi De Silva ◽  
Weng-Fai Wong

This article presents GRAM (GPU-based Runtime Adaption for Mixed-precision), a framework for the effective use of mixed-precision arithmetic in CUDA programs. Our method provides a fine-grained trade-off between output error and performance. It can create many variants that satisfy different accuracy requirements by adaptively assigning different groups of threads to different precision levels at runtime. To widen the range of applications that can benefit from its approximation, GRAM comes with an optional half-precision approximate math library. Using GRAM, precision can be traded off for performance improvements of up to 540%, depending on the application and the accuracy requirement.
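
A minimal sketch of the runtime idea the abstract describes: start from an aggressive precision assignment and promote groups of work to higher precision until the output error meets the requirement. The kernel, error metric, and grouping are hypothetical stand-ins (NumPy on the CPU rather than CUDA thread groups):

```python
import numpy as np

def run_groups(x, group_precisions):
    """Apply a toy kernel to each group of elements at its assigned precision."""
    out = np.empty_like(x, dtype=np.float64)
    groups = np.array_split(np.arange(x.size), len(group_precisions))
    for idx, dtype in zip(groups, group_precisions):
        v = x[idx].astype(dtype)
        out[idx] = (np.sqrt(v) * v).astype(np.float64)  # stand-in computation
    return out

def adapt_precision(x, error_budget, n_groups=4):
    """Promote groups from half to single precision, one at a time, until
    the relative error against a double-precision run is within budget."""
    reference = run_groups(x, [np.float64] * n_groups)
    precisions = [np.float16] * n_groups
    for g in range(n_groups):
        approx = run_groups(x, precisions)
        rel_err = np.max(np.abs(approx - reference) / np.abs(reference))
        if rel_err <= error_budget:
            break
        precisions[g] = np.float32  # promote one group and re-check
    return precisions

x = np.linspace(1.0, 100.0, 1024)
print(adapt_precision(x, error_budget=1e-4))
```

Tighter error budgets leave more groups at half precision only when the data tolerates it, which mirrors the accuracy-for-performance dial the article describes.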


Energies ◽  
2021 ◽  
Vol 14 (14) ◽  
pp. 4089
Author(s):  
Kaiqiang Zhang ◽  
Dongyang Ou ◽  
Congfeng Jiang ◽  
Yeliang Qiu ◽  
Longchuan Yan

In terms of power and energy consumption, DRAM plays a key role in modern server systems, alongside processors. Although power-aware scheduling is based on the proportion of energy consumed by DRAM relative to other components, when running memory-intensive applications the energy consumption of the whole server system is significantly affected by the non-energy-proportionality of DRAM. Furthermore, modern servers usually use the NUMA architecture in place of the original SMP architecture to increase memory bandwidth, so it is of great significance to study the energy efficiency of these two memory architectures. Therefore, in order to explore the power consumption characteristics of servers under memory-intensive workloads, this paper evaluates the power consumption and performance of memory-intensive applications on different generations of real rack servers. Through this analysis, we find that: (1) workload intensity and the number of concurrently executing threads affect server power consumption, but a fully utilized memory system does not necessarily yield good energy efficiency; (2) even if the memory system is not fully utilized, the memory capacity per processor core has a significant impact on application performance and server power consumption; (3) when running memory-intensive applications, memory utilization is not always a good indicator of server power consumption; and (4) reasonable use of the NUMA architecture improves memory energy efficiency significantly. The experimental results show that reasonable use of the NUMA architecture can improve memory energy efficiency by 16% compared with the SMP architecture, while unreasonable use reduces it by 13%. The findings presented in this paper provide useful insights and guidance for system designers and data center operators in energy-efficiency-aware job scheduling and energy conservation.
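
A minimal sketch of the efficiency comparison above, treating memory energy efficiency as useful work (bytes moved) per joule. The bandwidth and power figures are made-up placeholders chosen only to roughly echo the reported 16% gain and 13% loss, not the paper's measurements:

```python
def memory_efficiency(bytes_moved, watts, seconds):
    """Bytes per joule: higher is better."""
    return bytes_moved / (watts * seconds)

GB = 1e9

# Hypothetical runs of the same memory-intensive job:
smp         = memory_efficiency(400 * GB, watts=120, seconds=100)
numa_local  = memory_efficiency(400 * GB, watts=115, seconds=90)   # threads pinned near their memory
numa_remote = memory_efficiency(400 * GB, watts=125, seconds=110)  # heavy cross-socket traffic

print(f"NUMA local  vs SMP: {numa_local / smp - 1:+.1%}")   # gain from good placement
print(f"NUMA remote vs SMP: {numa_remote / smp - 1:+.1%}")  # loss from bad placement
```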


2021 ◽  
Vol 34 (5) ◽  
pp. 303-318
Author(s):  
Maarten Baele ◽  
An Vermeulen ◽  
Dimitri Adons ◽  
Roos Peeters ◽  
Angelique Vandemoortele ◽  
...  

Author(s):  
Harold O. Fried ◽  
Loren W. Tauer

This article explores how well an individual manages his or her own talent to achieve high performance in an individual sport. Its setting is the Ladies Professional Golf Association (LPGA). The order-m approach is explained, and the data and empirical findings are presented. The inputs measure fundamental golfing athletic ability; the output measures success on the LPGA tour. The correlation coefficient between earnings per event and the ability to perform under pressure is 0.48. Golfers' careers occur on the front end of the age distribution, and there is a classic trade-off between the inevitable deterioration in the mental ability to handle pressure and the experience gained with time. The ability to perform under pressure peaks at age 37.

