3D Stacked Cache Data Management for Energy Minimization of 3D Chip Multiprocessor

Author(s):  
K. Suresh Kumar ◽  
S. Anitha ◽  
M. Gayathri

In this model a runtime cache data mapping is discussed for 3-D stacked L2 caches to minimize the overall energy of 3-D chip multiprocessors (CMPs). The suggested method considers both temperature distribution and memory traffic of 3-D CMPs. Experimental result shows energy reduction achieving up to 22.88% compared to an existing solution which considers only the temperature distribution.  New tendencies envisage 3D Multi-Processor System-On-Chip (MPSoC) design as a promising solution to keep increasing the performance of the next-generation high performance computing (HPC) systems. However, as the power density of HPC systems increases with the arrival of 3D MPSoCs with energy reduction achieving up to 19.55% by supplying electrical power to the computing equipment and constantly removing the generated heat is rapidly becoming the dominant cost in any HPC facility.

Author(s):  
Xiaohan Tao ◽  
Jianmin Pang ◽  
Jinlong Xu ◽  
Yu Zhu

AbstractThe heterogeneous many-core architecture plays an important role in the fields of high-performance computing and scientific computing. It uses accelerator cores with on-chip memories to improve performance and reduce energy consumption. Scratchpad memory (SPM) is a kind of fast on-chip memory with lower energy consumption compared with a hardware cache. However, data transfer between SPM and off-chip memory can be managed only by a programmer or compiler. In this paper, we propose a compiler-directed multithreaded SPM data transfer model (MSDTM) to optimize the process of data transfer in a heterogeneous many-core architecture. We use compile-time analysis to classify data accesses, check dependences and determine the allocation of data transfer operations. We further present the data transfer performance model to derive the optimal granularity of data transfer and select the most profitable data transfer strategy. We implement the proposed MSDTM on the GCC complier and evaluate it on Sunway TaihuLight with selected test cases from benchmarks and scientific computing applications. The experimental result shows that the proposed MSDTM improves the application execution time by 5.49$$\times$$ × and achieves an energy saving of 5.16$$\times$$ × on average.


2012 ◽  
Vol 58 (1) ◽  
pp. 9-14 ◽  
Author(s):  
Dawid Zydek ◽  
Grzegorz Chmaj ◽  
Alaa Shawky ◽  
Henry Selvaraj

Location of Processor Allocator and Job Scheduler and Its Impact on CMP PerformanceHigh Performance Computing (HPC) architectures are being developed continually with an aim of achieving exascale capability by 2020. Processors that are being developed and used as nodes in HPC systems are Chip Multiprocessors (CMPs) with a number of cores. In this paper, we continue our effort towards a better processor allocation process. The Processor Allocator (PA) and Job Scheduler (JS) proposed and implemented in our previous works are explored in the context of its best location on the chip. We propose a system, where all locations on a chip can be analyzed, considering energy used by Network-on-Chip (NoC), PA and JS, and processing elements. We present energy models for the researched CMP components, mathematical model of the system, and experimentation system. Based on experimental results, proper placement of PA and JS on a chip can provide up to 45% NoC energy savings.


Author(s):  
Mehdi Modarressi ◽  
Hamid Sarbazi-Azad

In this chapter, we present a reconfigurable architecture for network-on-chips (NoC) on which arbitrary application-specific topologies can be implemented. The proposed NoC can dynamically tailor its topology to the traffic pattern of different applications, aiming to address one of the main drawbacks of existing application-specific NoC optimization methods, i.e. optimizing NoCs based on the traffic pattern of a single application. Supporting multiple applications is a critical feature of an NoC as several different applications are integrated into the modern and complex multi-core system-on-chips and chip multiprocessors and an NoC that is designed to run exactly one application does not necessarily meet the design constraints of other applications. The proposed NoC supports multiple applications by configuring as a topology which matches the traffic pattern of the currently running application in the best way. In this chapter, we first introduce the proposed reconfigurable topology and then address the two problems of core to network mapping and topology exploration. Experimental results show that this architecture effectively improves the performance of NoCs and reduces power consumption.


2020 ◽  
Vol 10 (4) ◽  
pp. 37
Author(s):  
Habiba Lahdhiri ◽  
Jordane Lorandel ◽  
Salvatore Monteleone ◽  
Emmanuelle Bourdel ◽  
Maurizio Palesi

The Network-on-chip (NoC) paradigm has been proposed as a promising solution to enable the handling of a high degree of integration in multi-/many-core architectures. Despite their advantages, wired NoC infrastructures are facing several performance issues regarding multi-hop long-distance communications. RF-NoC is an attractive solution offering high performance and multicast/broadcast capabilities. However, managing RF links is a critical aspect that relies on both application-dependent and architectural parameters. This paper proposes a design space exploration framework for OFDMA-based RF-NoC architecture, which takes advantage of both real application benchmarks simulated using Sniper and RF-NoC architecture modeled using Noxim. We adopted the proposed framework to finely configure a routing algorithm, working with real traffic, achieving up to 45% of delay reduction, compared to a wired NoC setup in similar conditions.


2013 ◽  
Vol 849 ◽  
pp. 302-309
Author(s):  
Yun Xu ◽  
Xin Hua Zhu ◽  
Yu Wang

With rapid development of micro fabrication technology, the performance of MIMU has gradually improved. The MIMU introduced in this paper is based on the silicon micro machined gyroscope of type MSG7000D and accelerometer of type MSA6000. The volume of it is 3×3×3cm3, the mass is 68.5g and the power consumption is less than 1w. The experimental result shows that the bias stability of the gyroscope and accelerometer for each axis of the designed MIMU is less than 10°/h and 0.5mg respectively. For the non orthogonality in three axes of the structure, MIMU needs to be calibrated. After calibration, the measurement accuracy has improved by an order of magnitude. The designed MIMU can satisfy the requirement of high performance, low cost, light weight and small size for strap-down navigation system, thus it can be widely applied not only to the field of vehicles integrated navigation, attitude measurement but also to the fields of personal goods such as mobile, game consoles and so on.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Ying-Qi Li ◽  
Hang Shi ◽  
Sheng-Bo Wang ◽  
Yi-Tong Zhou ◽  
Zi Wen ◽  
...  

Abstract Aqueous rechargeable microbatteries are promising on-chip micropower sources for a wide variety of miniaturized electronics. However, their development is plagued by state-of-the-art electrode materials due to low capacity and poor rate capability. Here we show that layered potassium vanadium oxides, KxV2O5·nH2O, have an amorphous/crystalline dual-phase nanostructure to show genuine potential as high-performance anode materials of aqueous rechargeable potassium-ion microbatteries. The dual-phase nanostructured KxV2O5·nH2O keeps large interlayer spacing while removing secondary-bound interlayer water to create sufficient channels and accommodation sites for hydrated potassium cations. This unique nanostructure facilitates accessibility/transport of guest hydrated potassium cations to significantly improve practical capacity and rate performance of the constituent KxV2O5·nH2O. The potassium-ion microbatteries with KxV2O5·nH2O anode and KxMnO2·nH2O cathode constructed on interdigital-patterned nanoporous metal current microcollectors exhibit ultrahigh energy density of 103 mWh cm−3 at electrical power comparable to carbon-based microsupercapacitors.


Author(s):  
Diandian Zhang ◽  
Han Zhang ◽  
Jeronimo Castrillon ◽  
Torsten Kempf ◽  
Bart Vanthournout ◽  
...  

Efficient runtime resource management in multi-processor systems-on-chip (MPSoCs) for achieving high performance and low energy consumption is one of the key challenges for system designers. OSIP, an operating system application-specific instruction-set processor, together with its well-defined programming model, provides a promising solution. It delivers high computational performance to deal with dynamic task scheduling and mapping. Being programmable, it can easily be adapted to different systems. However, the distributed computation among the different processing elements introduces complexity to the communication architecture, which tends to become the bottleneck of such systems. In this work, the authors highlight the vital importance of the communication architecture for OSIP-based systems and optimize the communication architecture. Furthermore, the effects of OSIP and the communication architecture are investigated jointly from the system point of view, based on a broad case study for a real life application (H.264) and a synthetic benchmark application.


2012 ◽  
Vol 2012 ◽  
pp. 1-15 ◽  
Author(s):  
Wen-Chung Tsai ◽  
Ying-Cherng Lan ◽  
Yu-Hen Hu ◽  
Sao-Jie Chen

The next generation of multiprocessor system on chip (MPSoC) and chip multiprocessors (CMPs) will contain hundreds or thousands of cores. Such a many-core system requires high-performance interconnections to transfer data among the cores on the chip. Traditional system components interface with the interconnection backbone via a bus interface. This interconnection backbone can be an on-chip bus or multilayer bus architecture. With the advent of many-core architectures, the bus architecture becomes the performance bottleneck of the on-chip interconnection framework. In contrast, network on chip (NoC) becomes a promising on-chip communication infrastructure, which is commonly considered as an aggressive long-term approach for on-chip communications. Accordingly, this paper first discusses several common architectures and prevalent techniques that can deal well with the design issues of communication performance, power consumption, signal integrity, and system scalability in an NoC. Finally, a novel bidirectional NoC (BiNoC) architecture with a dynamically self-reconfigurable bidirectional channel is proposed to break the conventional performance bottleneck caused by bandwidth restriction in conventional NoCs.


2011 ◽  
Vol 71 (5) ◽  
pp. 641-650 ◽  
Author(s):  
Gilbert Hendry ◽  
Eric Robinson ◽  
Vitaliy Gleyzer ◽  
Johnnie Chan ◽  
Luca P. Carloni ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document