System-level performance analysis for designing on-chip communication architectures

The continuous downscaling in CMOS devices has increased leakage power and limited the performance to a few GHz. The research goal has diverted from operating at high frequencies to deliver higher performance in essence with lower power. CMOS based on-chip memories consumes significant fraction of power in modern processors. This paper aims to explore the suitability of beyond CMOS, emerging magnetic memories for the use in memory hierarchy, attributing to their remarkable features like nonvolatility, high-density, ultra-low leakage and scalability. NVSim, a circuit-level tool, is used to explore different design layouts and memory organizations and then estimate the energy, area and latency performance numbers. A detailed system-level performance analysis of STT-MRAM and SOT-MRAM technologies and comparison with 22[Formula: see text]nm SRAM technology are presented. Analysis infers that in comparison to the existing 22[Formula: see text]nm SRAM technology, SOT-MRAM is more efficient in area for memory size [Formula: see text][Formula: see text]KB, speed and energy consumption for cache size [Formula: see text][Formula: see text]KB. A typical 256[Formula: see text]KB SOT-MRAM cache design is 27.74% area efficient, 2.97 times faster and consumes 76.05% lesser leakage than SRAM counterpart and these numbers improve for larger cache sizes. The article deduces that SOT-MRAM technology has a promising potential to replace SRAM in lower levels of computer memory hierarchy.

Download Full-text

System-Level Analysis of MPSoCs with a Hardware Scheduler

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Advancing Embedded Systems and Real-Time Communications with Emerging Technologies ◽

10.4018/978-1-4666-6034-2.ch014 ◽

2014 ◽

pp. 335-367

Author(s):

Diandian Zhang ◽

Jeronimo Castrillon ◽

Stefan Schürmans ◽

Gerd Ascheid ◽

Rainer Leupers ◽

...

Keyword(s):

Resource Management ◽

Distributed Memory ◽

Point Of View ◽

System Level ◽

Video Decoder ◽

Communication Architecture ◽

Communication Architectures ◽

Runtime Management ◽

On Chip ◽

Ip Blocks

Efficient runtime resource management in heterogeneous Multi-Processor Systems-on-Chip (MPSoCs) for achieving high performance and energy efficiency is one key challenge for system designers. In the past years, several IP blocks have been proposed that implement system-wide runtime task and resource management. As the processor count continues to increase, it is important to analyze the scalability of runtime managers at the system-level for different communication architectures. In this chapter, the authors analyze the scalability of an Application-Specific Instruction-Set Processor (ASIP) for runtime management called OSIP on two platform paradigms: shared and distributed memory. For the former, a generic bus is used as interconnect. For distributed memory, a Network-on-Chip (NoC) is used. The effects of OSIP and the communication architecture are jointly investigated from the system point of view, based on a broad case study with real applications (an H.264 video decoder and a digital receiver for wireless communications) and a synthetic benchmark application.

Download Full-text