NoCGuard: A Reliable Network-on-Chip Router Architecture

Electronics ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 342 ◽  
Author(s):  
Muhammad Akmal Shafique ◽  
Naveed Khan Baloch ◽  
Muhammad Iram Baig ◽  
Fawad Hussain ◽  
Yousaf Bin Zikria ◽  
...  

Aggressive scaling in deep nanometer technology enables chip multiprocessor design, facilitated by the communication-centric architecture that Network-on-Chip (NoC) provides. At the same time, it brings considerable reliability challenges, because a fault in the network architecture severely degrades system performance. To address these challenges, this research proposes NoCGuard, a reconfigurable architecture designed to tolerate multiple permanent faults in each pipeline stage of the generic router. The NoCGuard router architecture uses four highly reliable, low-cost fault-tolerant strategies. We exploit resource borrowing and double routing for the routing computation stage, a default winner strategy for the virtual channel allocation stage, runtime arbiter selection and a default winner strategy for the switch allocation stage, and multiple secondary bypass paths for the crossbar stage. Unlike existing reliable router architectures, our architecture features less redundancy, more fault tolerance, and high reliability. A reliability comparison shows a 5.53-fold improvement in lifetime under the Mean Time to Failure (MTTF) metric and a 22-fold improvement under the Silicon Protection Factor (SPF) metric, which is better than state-of-the-art reliable router architectures. Synthesis results using 15 nm and 45 nm technology libraries show that the additional circuitry incurs an area overhead of 28.7% and 28%, respectively. Latency analysis using synthetic, PARSEC, and SPLASH-2 traffic shows a minor latency increase of 3.41%, 12%, and 15%, respectively, while providing high reliability.

2020 ◽  
Vol 26 (4) ◽  
pp. 307-323
Author(s):  
Chakib Nehnouh

The Network-on-Chip (NoC) has become a promising communication infrastructure for Multiprocessor Systems-on-Chip (MPSoC). Reliability is a main concern in NoCs, and performance degrades when a NoC is susceptible to faults. A fault can be defined as the cause of a deviation from the desired operation of the system (an error). To deal with these reliability challenges, this work proposes OFDIM (Online Fault Detection and Isolation Mechanism), a novel combined methodology to tolerate multiple permanent and transient faults. The new router architecture uses two modules to provide a highly reliable and low-cost fault-tolerant strategy. In contrast to existing works, our architecture offers less area, more fault tolerance, and high reliability. The reliability comparison using the Silicon Protection Factor (SPF) shows a 22-fold improvement, and the additional circuitry incurs an area overhead of 27%, which is better than state-of-the-art reliable router architectures. The results also show that throughput decreases by only 5.19% and average latency increases by a minor 2.40%, while providing high reliability.


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5355
Author(s):  
Muhammad Rashid ◽  
Naveed Khan Baloch ◽  
Muhammad Akmal Shafique ◽  
Fawad Hussain ◽  
Shahroon Saleem ◽  
...  

Network-on-chip (NoC) architectures have become a popular communication platform for heterogeneous computing systems owing to their scalability and high performance. Aggressive technology scaling makes these architectures prone to both permanent and transient faults. This study focuses on the tolerance of a NoC router to permanent faults. A permanent fault in a NoC router severely impacts the performance of the entire network. Thus, it is necessary to incorporate component-level protection techniques in a router. In the proposed scheme, the input port utilizes a bypass path, virtual channel (VC) queuing, and VC closing strategies. Moreover, the routing computation stage utilizes spatial redundancy and double routing strategies, and the VC allocation stage utilizes spatial redundancy. The switch allocation stage utilizes run-time arbiter selection. The crossbar stage utilizes a triple bypass bus. The proposed router is highly fault-tolerant compared with the existing state-of-the-art fault-tolerant routers. The reliability of the proposed router is 7.98 times higher than that of the unprotected baseline router in terms of the mean-time-to-failure metric. The silicon protection factor metric is used to calculate the protection ability of the proposed router. Consequently, it is confirmed that the proposed router has a greater protection ability than the conventional fault-tolerant routers.


2010 ◽  
Vol 97 (10) ◽  
pp. 1181-1192 ◽  
Author(s):  
Ashkan Eghbal ◽  
Pooria M. Yaghini ◽  
H. Pedram ◽  
H. R. Zarandi

2017 ◽  
Vol E100.D (4) ◽  
pp. 910-913
Author(s):  
Ruilian XIE ◽  
Jueping CAI ◽  
Xin XIN ◽  
Bo YANG

Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1783 ◽  
Author(s):  
Ayaz Hussain ◽  
Muhammad Irfan ◽  
Naveed Khan Baloch ◽  
Umar Draz ◽  
Tariq Ali ◽  
...  

The router plays an important role in communication among the processing cores of an on-chip network. On one hand, technology scaling has enabled designers to integrate multiple processing components on a single chip; on the other hand, it has become a source of faults. A generic router consists of buffers and pipeline stages. A single fault may degrade performance or even cause the whole chip to stop working. Therefore, it is necessary to provide permanent-fault tolerance for all components of the router. In this paper, we propose a mechanism that can tolerate permanent faults that occur in the router. We exploit the fault-tolerant techniques of resource sharing and pairing between components for the input port unit and routing computation (RC) unit, resource borrowing for the virtual channel allocator (VA), and multiple paths for the switch allocator (SA) and crossbar (XB). The experimental results and analysis show that the proposed mechanism enhances the reliability of the router architecture against permanent faults at the cost of a 29% area overhead. The proposed router architecture achieves the highest Silicon Protection Factor (SPF), 24.8, among the state-of-the-art fault-tolerant architectures. It incurs an increase in latency for SPLASH-2 and PARSEC benchmark traffic that is minimal compared to the baseline router.
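Several abstracts in this listing report both a Silicon Protection Factor and an area overhead, so a short sketch of how the two figures are commonly related may help when comparing them. It assumes the usual definition from the fault-tolerant router literature, which the abstracts themselves do not state: SPF is the mean number of faults a design tolerates before failing, divided by its area normalized to the unprotected baseline (1 + overhead). The function names are hypothetical, and the back-calculated fault count is only what that assumed definition implies, not a figure reported by the authors.

```python
def silicon_protection_factor(mean_faults_to_failure: float,
                              area_overhead: float) -> float:
    """SPF under the ASSUMED definition: faults-to-failure / normalized area.

    area_overhead is a fraction of the baseline area (0.29 means 29%).
    """
    if area_overhead < 0:
        raise ValueError("area_overhead must be non-negative")
    return mean_faults_to_failure / (1.0 + area_overhead)


def implied_mean_faults(spf: float, area_overhead: float) -> float:
    """Invert the assumed definition to recover the mean faults to failure."""
    if area_overhead < 0:
        raise ValueError("area_overhead must be non-negative")
    return spf * (1.0 + area_overhead)


if __name__ == "__main__":
    # Figures quoted in the abstract above: SPF 24.8 at 29% area overhead.
    faults = implied_mean_faults(spf=24.8, area_overhead=0.29)
    print(f"implied mean faults to failure: {faults:.2f}")
    # Round trip: the implied fault count reproduces the quoted SPF.
    print(f"SPF recomputed: {silicon_protection_factor(faults, 0.29):.2f}")
```

Dividing by the normalized area is what lets designs with different overheads be compared on one scale: a router that tolerates more faults only by adding a lot of redundant area does not automatically score higher.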


Author(s):  
Naveed Khan Baloch ◽  
Ayaz Hussain ◽  
Muhammad Iram Baig

Shrinking transistor sizes have increased vulnerability to faults. The growing number of cores on a single chip has made the Network on Chip (NoC) the standard communication backbone among cores, but this facility is itself vulnerable to faults as transistors shrink. A permanent fault in the network leads to undesirable consequences such as permanent blocking of flits or failure of the whole router. Keeping the router operational has a significant impact on the reliability of the system. Permanent faults in the buffers and pipeline stages of the router have a high impact on performance. The proposed router architecture, Protector, provides fault protection for both buffers and pipeline stages by borrowing from other resources, using bypass paths, and creating multiple paths to reach the output. The proposed router incurs an area overhead of 30% compared to the baseline design. Reliability analysis using the Silicon Protection Factor indicates that the proposed router has better fault-tolerance efficiency than the state of the art. Latency analysis using the PARSEC and SPLASH-2 benchmarks indicates that the proposed router incurs 13% and 16% latency overhead, respectively, in the presence of faults.


2017 ◽  
Vol 26 (12) ◽  
pp. 1750200 ◽  
Author(s):  
Ruilian Xie ◽  
Jueping Cai ◽  
Peng Wang ◽  
Xin Zhang ◽  
Juan Wang

High reliability against undesirable effects is one of the key objectives in Network-on-Chip (NoC) design. As a result, a reliable and efficient routing method is highly desirable. This paper presents a novel turn model called NMad-y, which uses one and two virtual channels along the x- and y-dimensions, respectively, and the Adaptive and Fault-tolerant Routing Method (AFRM), which is designed based on the NMad-y turn model. AFRM can effectively tolerate multiple faulty routers and links in more complicated fault situations by using the link status of neighbor routers within two hops. AFRM improves the reliability of the network without sacrificing its performance. Simulation results show that AFRM achieves better saturation throughput (0.83% on average) than a state-of-the-art fault-tolerant routing method and maintains high reliability of more than 97.43% on average.

