Accurate power estimation of hardware co-processors using system level simulation

Author(s):  
Sumit Ahuja ◽  
Deepak A. Mathaikutty ◽  
Avinash Lakshminarayana ◽  
Sandeep Shukla
2022 ◽  
Vol 21 (1) ◽  
pp. 1-25
Author(s):  
Kazi Asifuzzaman ◽  
Rommel Sánchez Verdejo ◽  
Petar Radojković

It is questionable whether DRAM will continue to scale and will meet the needs of next-generation systems. Therefore, significant effort is invested in research and development of novel memory technologies. One of the candidates for next-generation memory is Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM). STT-MRAM is an emerging non-volatile memory with a lot of potential that could be exploited for various requirements of different computing systems. Being a novel technology, STT-MRAM devices are already approaching DRAM in terms of capacity, frequency, and device size. Although STT-MRAM technology got significant attention of various major memory manufacturers, academic research of STT-MRAM main memory remains marginal. This is mainly due to the unavailability of publicly available detailed timing and current parameters of this novel technology, which are required to perform a reliable main memory simulation on performance and power estimation. This study demonstrates an approach to perform a cycle accurate simulation of STT-MRAM main memory, being the first to release detailed timing and current parameters of this technology from academia—essentially enabling researchers to conduct reliable system-level simulation of STT-MRAM using widely accepted existing simulation infrastructure. The results show a fairly narrow overall performance deviation in response to significant variations in key timing parameters, and the power consumption experiments identify the key power component that is mostly affected with STT-MRAM.


Electronics ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 644
Author(s):  
Michal Frivaldsky ◽  
Jan Morgos ◽  
Michal Prazenica ◽  
Kristian Takacs

In this paper, we describe a procedure for designing an accurate simulation model using a price-wised linear approach referred to as the power semiconductor converters of a DC microgrid concept. Initially, the selection of topologies of individual power stage blocs are identified. Due to the requirements for verifying the accuracy of the simulation model, physical samples of power converters are realized with a power ratio of 1:10. The focus was on optimization of operational parameters such as real-time behavior (variable waveforms within a time domain), efficiency, and the voltage/current ripples. The approach was compared to real-time operation and efficiency performance was evaluated showing the accuracy and suitability of the presented approach. The results show the potential for developing complex smart grid simulation models, with a high level of accuracy, and thus the possibility to investigate various operational scenarios and the impact of power converter characteristics on the performance of a smart gird. Two possible operational scenarios of the proposed smart grid concept are evaluated and demonstrate that an accurate hardware-in-the-loop (HIL) system can be designed.


2021 ◽  
Vol 18 (4) ◽  
pp. 1-27
Author(s):  
Yasir Mahmood Qureshi ◽  
William Andrew Simon ◽  
Marina Zapater ◽  
Katzalin Olcoz ◽  
David Atienza

The increasing adoption of smart systems in our daily life has led to the development of new applications with varying performance and energy constraints, and suitable computing architectures need to be developed for these new applications. In this article, we present gem5-X, a system-level simulation framework, based on gem-5, for architectural exploration of heterogeneous many-core systems. To demonstrate the capabilities of gem5-X, real-time video analytics is used as a case-study. It is composed of two kernels, namely, video encoding and image classification using convolutional neural networks (CNNs). First, we explore through gem5-X the benefits of latest 3D high bandwidth memory (HBM2) in different architectural configurations. Then, using a two-step exploration methodology, we develop a new optimized clustered-heterogeneous architecture with HBM2 in gem5-X for video analytics application. In this proposed clustered-heterogeneous architecture, ARMv8 in-order cluster with in-cache computing engine executes the video encoding kernel, giving 20% performance and 54% energy benefits compared to baseline ARM in-order and Out-of-Order systems, respectively. Furthermore, thanks to gem5-X, we conclude that ARM Out-of-Order clusters with HBM2 are the best choice to run visual recognition using CNNs, as they outperform DDR4-based system by up to 30% both in terms of performance and energy savings.


Sign in / Sign up

Export Citation Format

Share Document