Evaluation Method of Synchronization for Shared-Memory On-Chip Many-Core Processor

Current computing platforms encourage the integration of thousands of processing cores, and their interconnections, into a single chip. Mobile smartphones, IoT, embedded devices, desktops, and data centers use Many-Core Systems-on-Chip (SoCs) to exploit their compute power and parallelism to meet dynamic workload requirements. Networks-on-Chip (NoCs) provide scalable connectivity for diverse applications with distinct traffic patterns and data dependencies. However, when the system executes various applications on traditional NoCs (optimized and fixed at synthesis time), the mismatch between the interconnection and the different applications' requirements limits performance. In the literature, NoC designs have embraced the Software-Defined Networking (SDN) strategy to evolve into an adaptable interconnection solution for future chips. However, the works surveyed implement only a partial Software-Defined Network-on-Chip (SDNoC) approach, leaving aside the SDN layered architecture that brings interoperability to conventional networking. This paper explores the SDNoC literature and classifies it according to the desired SDN features that each work presents. We then describe the challenges and opportunities identified in the literature survey. Moreover, we explain the motivation for an SDNoC approach and present both SDN and SDNoC concepts and architectures. We observe that works in the literature employ an incomplete layered SDNoC approach. This leaves several fertile areas in the SDNoC architecture where researchers may contribute to Many-Core SoC designs.

Near-Optimal Thermal Monitoring Framework for Many-Core Systems-on-Chip

IEEE Transactions on Computers ◽

10.1109/tc.2015.2395423 ◽

2015 ◽

Vol 64 (11) ◽

pp. 3197-3209 ◽

Cited By ~ 2

Author(s):

Juri Ranieri ◽

Alessandro Vincenzi ◽

Amina Chebira ◽

David Atienza ◽

Martin Vetterli

Keyword(s):

Thermal Monitoring ◽

Systems On Chip ◽

Monitoring Framework ◽

On Chip ◽

Many Core

MCVP-NoC: Many-Core Virtual Platform with Networks-on-Chip support

2013 IEEE 10th International Conference on ASIC ◽

10.1109/asicon.2013.6811836 ◽

2013 ◽

Author(s):

Dexue Zhang ◽

Xiaoyang Zeng ◽

Zongyan Wang ◽

Weike Wang ◽

Xinhua Chen

Keyword(s):

Networks On Chip ◽

Virtual Platform ◽

On Chip ◽

Many Core

FoToNoC: A Folded Torus-Like Network-on-Chip Based Many-Core Systems-on-Chip in the Dark Silicon Era

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2016.2643669 ◽

2017 ◽

Vol 28 (7) ◽

pp. 1905-1918 ◽

Cited By ~ 16

Author(s):

Lei Yang ◽

Weichen Liu ◽

Weiwen Jiang ◽

Mengquan Li ◽

Peng Chen ◽

...

Keyword(s):

Network On Chip ◽

Dark Silicon ◽

Systems On Chip ◽

On Chip ◽

Many Core

Compiler-directed scratchpad memory data transfer optimization for multithreaded applications on a heterogeneous many-core architecture

The Journal of Supercomputing ◽

10.1007/s11227-021-03853-x ◽

2021 ◽

Author(s):

Xiaohan Tao ◽

Jianmin Pang ◽

Jinlong Xu ◽

Yu Zhu

Keyword(s):

Energy Consumption ◽

High Performance ◽

Scientific Computing ◽

Data Transfer ◽

Performance Model ◽

Experimental Result ◽

Transfer Model ◽

Scratchpad Memory ◽

On Chip ◽

Many Core

The heterogeneous many-core architecture plays an important role in the fields of high-performance computing and scientific computing. It uses accelerator cores with on-chip memories to improve performance and reduce energy consumption. Scratchpad memory (SPM) is a kind of fast on-chip memory with lower energy consumption compared with a hardware cache. However, data transfer between SPM and off-chip memory can be managed only by a programmer or compiler. In this paper, we propose a compiler-directed multithreaded SPM data transfer model (MSDTM) to optimize the process of data transfer in a heterogeneous many-core architecture. We use compile-time analysis to classify data accesses, check dependences, and determine the allocation of data transfer operations. We further present a data transfer performance model to derive the optimal granularity of data transfer and select the most profitable data transfer strategy. We implement the proposed MSDTM on the GCC compiler and evaluate it on Sunway TaihuLight with selected test cases from benchmarks and scientific computing applications. The experimental results show that the proposed MSDTM improves the application execution time by 5.49× and achieves an energy saving of 5.16× on average.

Advanced Extensible Crossbar Protocol for Connecting Multi-Cores and Shared-Memory on-Chip

2018 8th International Conference on Electronics Information and Emergency Communication (ICEIEC) ◽

10.1109/iceiec.2018.8473474 ◽

2018 ◽

Author(s):

Hongyu Meng ◽

Donglin Wang ◽

Ziiun Liu ◽

Yang Guo

Keyword(s):

Shared Memory ◽

On Chip

Memory Map: A Multiprocessor Cache Simulator

Journal of Electrical and Computer Engineering ◽

10.1155/2012/365091 ◽

2012 ◽

Vol 2012 ◽

pp. 1-12 ◽

Cited By ~ 4

Author(s):

Shaily Mittal ◽

Nitin

Keyword(s):

Shared Memory ◽

Data Flow ◽

Memory Systems ◽

System On Chip ◽

Multiprocessor System ◽

Flow Management ◽

Hit Rate ◽

Multiple Processors ◽

On Chip ◽

Cache Miss

Manufacturers now focus mainly on Multiprocessor System-on-Chip (MPSoC) architectures to provide increased concurrency, rather than increased clock speed, for embedded systems. However, managing concurrency is a difficult task, and one major issue is synchronizing concurrent accesses to shared memory. Memory configuration and data flow management are important parts of any system design process. Although it is very important to select a correct memory configuration, it may be equally important to choreograph the data flow between the various levels of memory in an optimal manner. Memory Map is a multiprocessor simulator that choreographs data flow in the individual caches of multiple processors and in shared memory systems. The simulator allows the user to specify cache reconfigurations and the number of processors within the application program, and it evaluates the cache miss and hit rates for each configuration phase, taking reconfiguration costs into account. The code is open source and written in Java.
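To illustrate what evaluating hit and miss counts for different cache configurations can look like, here is a minimal sketch in Python. It is not the Memory Map simulator and does not reflect its design: it models a single direct-mapped cache only, ignores multiple processors and reconfiguration costs, and the function name `simulate_direct_mapped`, its parameters, and the address trace are all hypothetical.

```python
def simulate_direct_mapped(addresses, num_lines, block_size):
    """Return (hits, misses) for a direct-mapped cache over a byte-address trace.

    Illustrative assumption: one cache, no coherence, no reconfiguration cost.
    """
    tags = [None] * num_lines  # one stored tag per cache line; None means empty
    hits = 0
    misses = 0
    for addr in addresses:
        block = addr // block_size   # memory block holding this byte address
        index = block % num_lines    # cache line the block maps to
        tag = block // num_lines     # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag        # load the block, evicting the old one
    return hits, misses


if __name__ == "__main__":
    # Hypothetical trace: byte addresses touching blocks 0, 1, 0, 1, 2, 0.
    trace = [0, 16, 0, 16, 32, 0]
    for num_lines in (1, 2, 4):
        hits, misses = simulate_direct_mapped(trace, num_lines, block_size=16)
        rate = hits / (hits + misses)
        print(f"lines={num_lines}: hits={hits} misses={misses} hit rate={rate:.2f}")
```

Running the same trace against several values of `num_lines` mimics, in a very reduced form, comparing configurations by their hit rate.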
