Congestion aware adaptive routing for network-on-chip communication

2021
Author(s):
Stephen Chui

Network-on-Chip (NoC) has surpassed traditional bus-based on-chip communication in offering better performance for data transfers among the many processing, peripheral, and other cores of high-performance embedded systems. Adaptive routing provides an effective means of efficient on-chip communication among NoC cores, and more efficient message routing can further improve the performance of NoC-based embedded systems on a chip. Congestion awareness has been applied to adaptive routing to achieve better data throughput and latency. This thesis presents a novel approach to analyzing congestion that improves NoC throughput by improving packet allocation in NoC routers. By utilizing the congestion information, the routers gain knowledge of the traffic conditions around them. We employ header flits to store the congestion information, so no additional communication links between the routers are required. Prioritizing data packets that are likely to suffer the worst congestion would improve overall NoC data transfer latency.
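
As a rough illustration of the idea in the abstract above, the following Python sketch carries a congestion estimate in each packet's header flit and lets a router grant a contested output port to the packet with the highest estimate. The abstract does not specify the congestion metric or the arbitration rule, so the `HeaderFlit` fields, the use of buffer occupancy as the metric, and the tie-break on `packet_id` are all assumptions made for illustration, not the thesis's actual design.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HeaderFlit:
    """Hypothetical header flit; the field layout is an assumption."""
    packet_id: int
    dst: Tuple[int, int]
    congestion: int = 0  # accumulated congestion estimate carried in the header


def record_congestion(flit: HeaderFlit, local_occupancy: int) -> HeaderFlit:
    """Add this router's buffer occupancy (assumed metric) to the header.

    Because the estimate travels inside the header flit, no extra
    router-to-router links are needed to share it.
    """
    flit.congestion += local_occupancy
    return flit


def arbitrate(candidates: List[HeaderFlit]) -> Optional[HeaderFlit]:
    """Grant the output port to the packet with the highest congestion estimate.

    Ties go to the lowest packet_id (an arbitrary, assumed tie-break).
    Returns None when no packet is requesting the port.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda f: (f.congestion, -f.packet_id))


# Two packets contend for the same output port.
a = HeaderFlit(packet_id=1, dst=(3, 3))
b = HeaderFlit(packet_id=2, dst=(0, 2))
record_congestion(a, 2)
record_congestion(a, 1)
record_congestion(b, 5)
winner = arbitrate([a, b])
```

Here packet 2 has accumulated the larger estimate, so it is served first. A real router would also need to bound the header field width and guard against starvation of low-congestion packets, which this sketch ignores.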


2016, Vol 13 (10), pp. 7592-7598
Author(s):  
J Kalaivani ◽  
B Vinayagasundaram

Network-on-Chip (NoC) systems have emerged as an on-chip communication architecture in various fields. To achieve good results in NoC applications, the routing must eliminate deadlock from the network. To address this issue, we propose Deadlock Free Load Balanced Adaptive Routing. In this approach, an Oblivious Routing (OR) algorithm is implemented on the channel using a probability function. The network considers the capacity of each node and tries to maximize throughput based on the connectivity between data packet flows while minimizing the channel load. A Reconfiguration Protocol lets data packets choose another channel in the network if deadlock occurs. Simulation results show that this approach reduces delay and packet loss in the network.


Author(s):  
A. Ferrerón Labari ◽  
D. Suárez Gracia ◽  
V. Viñals Yúfera

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance and low power consumption together is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of the properties of NUCA (Non-Uniform Cache Architectures). The key points are decoupling the functionality and utilizing three specialized on-chip networks. This structure has proven efficient for data hierarchies, achieving good performance and reducing energy consumption. On the other hand, instruction caches have different requirements and characteristics than data caches, which conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of utilizing small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. We therefore need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control, and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


2021, Vol 2, pp. 485-496
Author(s):  
Kasem Khalil ◽  
Omar Eldash ◽  
Ashok Kumar ◽  
Magdy Bayoumi

2011, Vol 474-476, pp. 413-416
Author(s):  
Jia Jia ◽  
Duan Zhou ◽  
Jian Xian Zhang

In this paper, we propose a novel adaptive routing algorithm to solve the communication congestion problem in Network-on-Chip (NoC). A strategy in which packets compete for output ports in both the X and Y directions is employed to make fuller use of the router's output ports, reduce transmission latency, and improve throughput. Experimental results show that the proposed algorithm is very effective in relieving communication congestion, achieving a 45.7% reduction in average latency and a 44.4% improvement in throughput compared with the deterministic XY routing algorithm and the simple XY adaptive routing algorithm.
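
For readers unfamiliar with the baselines named above, here is a minimal Python sketch of deterministic XY routing on a 2D mesh, next to a simple adaptive variant that may take either the X or the Y output port when both bring the packet closer to its destination. It illustrates the baselines only, not the paper's proposed algorithm. The port names (`E`, `W`, `N`, `S`), the convention that `N` increases `y`, and the `busy` set standing in for port state are assumptions made for this example.

```python
from typing import List, Optional, Set, Tuple

Coord = Tuple[int, int]


def xy_route(src: Coord, dst: Coord) -> List[Coord]:
    """Deterministic XY routing: move along X until the column matches, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def productive_ports(cur: Coord, dst: Coord) -> List[str]:
    """Output ports that move the packet closer to dst (at most one per dimension)."""
    ports = []
    if dst[0] != cur[0]:
        ports.append("E" if dst[0] > cur[0] else "W")
    if dst[1] != cur[1]:
        ports.append("N" if dst[1] > cur[1] else "S")
    return ports


def adaptive_port(cur: Coord, dst: Coord, busy: Set[str]) -> Optional[str]:
    """Simple minimal adaptive choice: first productive port that is not busy.

    The X port is tried first, so this reduces to XY routing when nothing is busy.
    Returns None at the destination or when every productive port is busy.
    """
    for port in productive_ports(cur, dst):
        if port not in busy:
            return port
    return None


path = xy_route((0, 0), (2, 1))
free_choice = adaptive_port((0, 0), (2, 1), busy=set())
detour_choice = adaptive_port((0, 0), (2, 1), busy={"E"})
```

With the east port busy, the adaptive variant takes the north port instead of stalling, which is the kind of output-port utilization the abstract aims to improve. Unrestricted adaptive turns can introduce deadlock on a mesh, so practical designs pair such a choice with turn restrictions or virtual channels; this sketch does not address that.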


Author(s):  
Xiaohan Tao ◽  
Jianmin Pang ◽  
Jinlong Xu ◽  
Yu Zhu

The heterogeneous many-core architecture plays an important role in the fields of high-performance computing and scientific computing. It uses accelerator cores with on-chip memories to improve performance and reduce energy consumption. Scratchpad memory (SPM) is a kind of fast on-chip memory with lower energy consumption compared with a hardware cache. However, data transfer between SPM and off-chip memory can be managed only by a programmer or compiler. In this paper, we propose a compiler-directed multithreaded SPM data transfer model (MSDTM) to optimize the process of data transfer in a heterogeneous many-core architecture. We use compile-time analysis to classify data accesses, check dependences, and determine the allocation of data transfer operations. We further present a data transfer performance model to derive the optimal granularity of data transfer and select the most profitable data transfer strategy. We implement the proposed MSDTM in the GCC compiler and evaluate it on Sunway TaihuLight with selected test cases from benchmarks and scientific computing applications. The experimental results show that the proposed MSDTM improves application execution time by 5.49× and achieves an energy saving of 5.16× on average.

