Cluster Based Networks-on-Chip

Author(s):  
Khalid Latif ◽  
Amir-Mohammad Rahmani ◽  
Tiberiu Seceleanu ◽  
Hannu Tenhunen

Partial Virtual channel Sharing (PVS) architecture has been proposed to enhance the performance of Networks-on-Chip (NoC) based systems. In this paper, the authors present an efficient and reliable Network Interface (NI) assisted routing strategy for NoC using the PVS architecture. For this purpose, the NoC system is divided into clusters. Each cluster is a group of two nodes comprising Processing Elements (PE), switches, links, etc. Each PE in a cluster can inject data into the network through the router that is closer to the destination. This reduces the network load by reducing the average hop count of the network. The proposed architecture can recover a PE disconnected from the network by network-level faults by allowing the PE to transmit and receive packets through the other router in the cluster. A 5×6 crossbar is used for the proposed architecture; compared to a 5×5 crossbar, it requires one more 5×1 multiplexer but does not increase the critical path delay of the router. The proposed router has been simulated for uniform, transpose and negative exponential distribution (NED) traffic patterns. The simulation results show a significant reduction in average packet latency at the expense of negligible area overhead.
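The injection choice described in this abstract can be illustrated with a minimal sketch. It assumes a 2D mesh in which the hop count between two routers is their Manhattan distance (as under minimal XY routing); the abstract does not state the topology or routing function, and the names `hop_count` and `pick_injection_router` are hypothetical, not taken from the paper.

```python
from typing import Iterable, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the assumed 2D mesh


def hop_count(src: Coord, dst: Coord) -> int:
    """Router-to-router hops under minimal routing in a 2D mesh (assumption)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def pick_injection_router(cluster: Iterable[Coord], dst: Coord,
                          faulty: Iterable[Coord] = ()) -> Coord:
    """Return the usable cluster router closest to the destination.

    Routers listed in `faulty` are skipped, which mimics a PE falling back
    to the other router of its cluster after a network-level fault.
    """
    faulty_set = set(faulty)
    usable = [r for r in cluster if r not in faulty_set]
    if not usable:
        raise ValueError("no usable router in cluster")
    return min(usable, key=lambda r: hop_count(r, dst))


# Hypothetical cluster made of the two routers at (0, 0) and (1, 0).
cluster = [(0, 0), (1, 0)]
print(pick_injection_router(cluster, (3, 2)))                    # closer router
print(pick_injection_router(cluster, (3, 2), faulty=[(1, 0)]))   # fallback
```

With this toy cluster, a packet for (3, 2) needs 4 hops from (1, 0) instead of 5 from (0, 0), which is the kind of saving the abstract attributes to NI-assisted injection.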

Author(s):  
Simon J. Hollis ◽  
Chris Jackson

The Skip-link architecture dynamically reconfigures Network-on-Chip (NoC) topologies in order to reduce the overall switching activity in many-core systems. The proposed architecture allows the creation of long-range Skip-links at runtime to reduce the logical distance between frequently communicating nodes. This offers a number of advantages over existing methods of creating optimised topologies already present in research, such as the Reconfigurable NoC (ReNoC) architecture and static Long-Range Link (LRL) insertion. This architecture monitors traffic behaviour and optimises the mesh topology without prior analysis of communications behaviour, and is thus applicable to all applications. The technique described here does not utilise a master node, and each router acts independently. The architecture is thus scalable to future many-core networks. The authors evaluate the performance using a cycle-accurate simulator with synthetic traffic patterns and compare the results to a mesh architecture, demonstrating logical hop count reductions of 12-17%. Coupled with this, up to a doubling in critical load is observed, and the potential for 10% energy reductions on a 16×16 node network.
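How a long-range link shortens the logical distance between two nodes can be shown with a small sketch. It assumes a 2D mesh with Manhattan-distance hop counts and a single bidirectional link that costs one hop; it does not model the runtime traffic monitoring or reconfiguration described in the abstract, and the function names are hypothetical.

```python
from typing import Tuple

Coord = Tuple[int, int]  # (x, y) position of a node in the assumed 2D mesh


def mesh_hops(src: Coord, dst: Coord) -> int:
    """Hop count over plain mesh links only."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def hops_with_link(src: Coord, dst: Coord, link: Tuple[Coord, Coord]) -> int:
    """Hop count when one extra bidirectional link (one hop) may be used."""
    a, b = link
    direct = mesh_hops(src, dst)
    via_ab = mesh_hops(src, a) + 1 + mesh_hops(b, dst)  # enter at a, leave at b
    via_ba = mesh_hops(src, b) + 1 + mesh_hops(a, dst)  # enter at b, leave at a
    return min(direct, via_ab, via_ba)


# Corner-to-corner traffic on an 8x8 mesh with a hypothetical long-range link.
print(mesh_hops((0, 0), (7, 7)))
print(hops_with_link((0, 0), (7, 7), ((0, 1), (7, 6))))
```

Traffic that does not benefit from the link simply keeps its mesh route, since the direct path is always one of the candidates.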


Author(s):  
Amit Chaurasia ◽  
Vivek Kumar Sehgal

In this paper, we study bursty synthetic traffic for Gaussian and Non-Gaussian traffic traces on the NoC architecture. This is the first study on the performance of Gaussian and Non-Gaussian application traffic on multicore architectures. The marginal distribution of real-time traffic is Non-Gaussian in nature, so analytical studies or simulations that assume Gaussian traffic will not be accurate and do not capture the true characteristics of application traffic. Simulation is performed on synthetically generated traces of Gaussian and Non-Gaussian traffic for different traffic patterns. The performance of the two traffic types is validated by simulating packet loss probability, average link utilization and average end-to-end latency; the results show that the Non-Gaussian traffic captures the burstiness more effectively than the Gaussian traffic for the desired application.


2015 ◽  
Vol 19 (1) ◽  
pp. 14 ◽  
Author(s):  
Burhan Khurshid ◽  
Roohie Naaz

Modern day field programmable gate arrays (FPGAs) have very huge and versatile logic resources resulting in the migration of their application domain from prototype designing to low and medium volume production designing. Unfortunately most of the work pertaining to FPGA implementations does not focus on the technology dependent optimizations that can implement a desired functionality with reduced cost. In this paper we consider the mapping of simple ripple carry fixed-point adders (RCA) on look-up table (LUT) based FPGAs. The objective is to transform the given RCA Boolean network into an optimized circuit netlist that can implement the desired functionality with minimum cost. We particularly focus on 6-input LUTs that are inherent in all the modern day FPGAs. Technology dependent optimizations are carried out to utilize this FPGA primitive efficiently and the result is compared against various adder designs. The implementation targets the XC5VLX30-3FF324 device from Xilinx Virtex-5 FPGA family. The cost of the circuit is expressed in terms of the resources utilized, critical path delay and the amount of on-chip power dissipated. Our implementation results show a reduction in resources usage by at least 50%; increase in speed by at least 10% and reduction in dynamic power dissipation by at least 30%. All this is achieved without any technology independent (architectural) modification.
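For readers unfamiliar with the ripple carry adder that this abstract takes as its starting Boolean network, the following is a minimal bit-level sketch. It shows only the textbook RCA (a chain of full adders); it is not the authors' LUT mapping or any of their optimizations, and the function names are hypothetical.

```python
from typing import Tuple


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """One-bit full adder: returns (sum bit, carry-out bit)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a: int, b: int, width: int) -> Tuple[int, int]:
    """Add two unsigned `width`-bit values by chaining full adders.

    The carry ripples from bit 0 upward; the final carry is returned
    separately so overflow is visible to the caller.
    """
    carry = 0
    total = 0
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


print(ripple_carry_add(5, 3, 4))   # fits in 4 bits
print(ripple_carry_add(15, 1, 4))  # overflows: sum wraps, carry-out is set
```

Each sum and carry bit here is a function of three inputs, which is the kind of small Boolean function that a LUT-based mapping has to pack into the device primitives.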


Author(s):  
Ashima Arora ◽  
Neeraj K Shukla ◽  
Shaloo Kikan

Networks on chip are being developed as a communication infrastructure in the design of multiprocessor SoCs. With the reduction in feature size, transient faults on the links are becoming a major issue for the performance of NoCs. In this paper, two fault-tolerant algorithms are proposed. The first is a faulty-link-tolerant algorithm that measures network loads on the links in order to reduce transient faults and balance the load. To address the effect of hardware faults, a fault- and congestion-controlled algorithm is designed that controls not only congestion but also faults on both the links and the nodes. The proposed strategies are evaluated on two different synthetic traffic patterns, and the results show better network and hardware performance for both routing algorithms in comparison with non-fault-tolerant routing.

