Energy efficient on-chip power delivery with run-time voltage regulator clustering

Author(s):  
Divya Pathak ◽  
Mohammad Hossein Hajkazemi ◽  
Mohammad Khavari Tavana ◽  
Houman Homayoun ◽  
Ioannis Savidis
2021 ◽  
Author(s):  
Akram Hadeed

Technology scaling and continually shrinking transistor sizes have enabled the placement of an increasing number of cores on a single chip, in the form of chip multiprocessors (CMPs), to improve performance. In this context, power consumption has become the main constraint in CMP design. Uncore components now take an increasing share of the on-chip power budget, so designing power management techniques, particularly for memory and network-on-chip (NoC) systems, has become an important problem. Consequently, considerable attention has been directed toward power management of CMP components, particularly shared caches and uncore interconnect structures, to overcome the challenge of a limited chip power budget.
This work aims to design an energy-efficient uncore architecture by exploiting heterogeneity in components (cache cells) and in operational parameters (voltage/frequency). To keep the impact on system performance to a minimum, a run-time approach is investigated to assess the proposed method. An architecture is proposed in which the cache layer contains heterogeneous cache banks, all placed in a single voltage/frequency domain. Average memory access time (AMAT) was selected as the monitor for tracking performance at run time. The size and type of the last-level cache (LLC) and the voltage/frequency of the uncore domain are adjusted according to the calculated AMAT, which indicates the demand the system places on the uncore.
The proposed hybrid architecture was implemented, investigated, and compared with a baseline model in which only SRAM banks were used in the last-level cache. Experimental results on the Princeton Application Repository for Shared-Memory Computers (PARSEC) benchmark suite show that the proposed architecture yields up to a 40% reduction in overall chip energy-delay product, with a marginal average performance degradation of 1.2% relative to the baseline. The best energy saving was 55%, and the worst degradation was only 15%.
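The abstract above uses AMAT as its run-time signal and reports results as an energy-delay product. As a rough illustration only, the sketch below computes the textbook forms of both quantities and adds a hypothetical threshold policy. The function names, the `low`/`high` thresholds, and the direction of the policy (a higher AMAT triggers a scale-up) are assumptions made for this example and are not taken from the work itself.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Textbook average memory access time: hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be in [0, 1]")
    return hit_time + miss_rate * miss_penalty


def energy_delay_product(energy, delay):
    """Energy-delay product (EDP): lower is better."""
    return energy * delay


def uncore_action(amat_cycles, low, high):
    """Hypothetical policy: map a measured AMAT to an uncore adjustment.

    Assumption (not from the abstract): a high AMAT means the uncore is under
    demand and should be scaled up; a low AMAT means it can be scaled down.
    """
    if amat_cycles > high:
        return "scale_up"
    if amat_cycles < low:
        return "scale_down"
    return "hold"
```

For instance, a 10-cycle hit time, a 25% miss rate, and a 100-cycle miss penalty give an AMAT of 35 cycles, which this illustrative policy would treat as a scale-up request when `high` is 30.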


2017 ◽  
Vol 32 (1) ◽  
pp. 378-393 ◽  
Author(s):  
Toke M. Andersen ◽  
Florian Krismer ◽  
Johann W. Kolar ◽  
Thomas Toifl ◽  
Christian Menolfi ◽  
...  

2013 ◽  
Vol 28 (1) ◽  
pp. 54-71 ◽  
Author(s):  
Xiao-Hang Wang ◽  
Peng Liu ◽  
Mei Yang ◽  
Maurizio Palesi ◽  
Ying-Tao Jiang ◽  
...  
