Reducing Resource Redundancy for Concurrent Error Detection Techniques in High Performance Microprocessors

Author(s):  
S. Kumar ◽  
A. Aggarwal
1992 ◽  
Vol 02 (03) ◽  
pp. 281-304
Author(s):  
SANJAY P. POPLI ◽  
MAGDY A. BAYOUMI ◽  
AKASH TYAGI

Real-time digital signal processing (DSP) applications require high performance parallel architectures that are also reliable. VLSI arrays are good candidates for providing the required high throughput for these applications. These arrays which consist of a number of regularly interconnected processing elements (PEs) will not function correctly in the presence of even a single fault in any of the PEs. Fault tolerance has therefore become a vital design criterion for VLSI arrays. In this paper, a fault tolerance strategy for VLSI arrays is proposed, which significantly improves the reliability of the system. The fault tolerance scheme is composed of two phases: testing and locating faults (fault detection and diagnosis), and reconfiguration. The first phase employs an on-line error detection technique which achieves a compromise between the space and time redundancy approaches. This concurrent error detection technique reduces the rollback time considerably. The reconfiguration phase is achieved by using a global control responsible for changing the states of the switches in the interconnection network. Backtracking is introduced into the algorithm for maximizing the processor utilization, at the same time keeping the complexity of the interconnection network as simple as possible. Finally, a reliability analysis of this scheme using a Markov model and a comparison with some previous schemes are given.


Sign in / Sign up

Export Citation Format

Share Document