SOFTWARE IMPLEMENTED HARDWARE-TRANSIENT FAULTS DETECTION

2014 ◽  
pp. 26-30
Author(s):  
Goutam Kumar Saha

This paper examines a software-implemented self-checking technique that is capable of detecting hardware-transient faults in processor registers. The proposed approach is intended to detect run-time transient bit errors in memory and in the processor status register. Error correction is not considered here. This low-cost approach is intended for commodity systems that use ordinary off-the-shelf microprocessors, where detecting operational faults is a step toward a fail-safe kind of fault-tolerant system.
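The abstract does not spell out the checking mechanism, so the following is only a minimal sketch of one common software self-checking idea: each word is stored together with its bitwise complement, and any disagreement between the two copies signals a bit error. The 32-bit width and the names `CheckedWord` and `flip_bit` are assumptions made for illustration, not the paper's design. As in the abstract, the sketch detects errors and does not correct them.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit word width (illustrative choice)


class CheckedWord:
    """Hypothetical self-checking word: value plus its bitwise complement."""

    def __init__(self, value):
        self.value = value & MASK
        self.shadow = ~value & MASK  # complemented copy used only for checking

    def is_consistent(self):
        # For an intact pair, value XOR shadow has every bit set.
        return (self.value ^ self.shadow) == MASK

    def read(self):
        # Detection only: report the fault, do not try to repair it.
        if not self.is_consistent():
            raise RuntimeError("transient bit error detected")
        return self.value


def flip_bit(word, bit):
    """Simulate a transient single-bit upset in a stored word."""
    return word ^ (1 << bit)
```

A caller would wrap critical variables in `CheckedWord` and treat the raised error as the trigger for a fail-safe action such as halting or restarting the task.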

2021 ◽  
Author(s):  
Gopalakrishnan Sundararajan

This chapter presents a solution for fault tolerance in Multi-Valued Logic (MVL) circuits composed of Carbon Nano-Tube Field Effect Transistors (CNTFETs). It reviews the basic primitives of MVL and describes ternary implementations of CNTFET circuits. Finally, it describes a method for error correction called Restorative Feedback (RFB). The RFB method is a variant of Triple-Modular Redundancy (TMR) that utilizes the fault-masking capabilities of the Muller C element to provide added protection against noisy transient faults. The fault-tolerant properties of the Muller C element are discussed, and the error correction capability of the RFB method is demonstrated in detail.
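The two building blocks named in the abstract can be modelled behaviourally as follows. This is a sketch of a generic majority voter and a generic Muller C element only; it does not reproduce the chapter's RFB circuit, and the class and function names are hypothetical. Values are taken to be ternary digits (0, 1, 2), an assumption that matches the ternary setting described above.

```python
from collections import Counter


def majority(values):
    """TMR-style voter: the value held by more than half of the replicas,
    or None when no strict majority exists."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


class CElement:
    """Behavioural Muller C element: the output follows the inputs only when
    all of them agree; otherwise it holds its previous state."""

    def __init__(self, initial=0):
        self.state = initial

    def update(self, *inputs):
        if len(set(inputs)) == 1:
            self.state = inputs[0]  # all inputs agree: adopt the common value
        return self.state           # disagreement: keep the stored value
```

The hold-on-disagreement behaviour is what gives the C element its masking property: a transient disturbance on one input cannot change the output by itself.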


2020 ◽  
Vol 26 (4) ◽  
pp. 307-323
Author(s):  
Chakib Nehnouh

The Network-on-Chip (NoC) has become a promising communication infrastructure for Multiprocessor Systems-on-Chip (MPSoC). Reliability is a main concern in NoCs, and performance degrades when a NoC is susceptible to faults. A fault can be defined as a cause of deviation from the desired operation of the system (an error). To deal with these reliability challenges, this work proposes OFDIM (Online Fault Detection and Isolation Mechanism), a novel combined methodology to tolerate multiple permanent and transient faults. The new router architecture uses two modules to provide a highly reliable and low-cost fault-tolerant strategy. In contrast to existing works, our architecture offers less area, more fault tolerance, and high reliability. The reliability comparison using the Silicon Protection Factor (SPF) shows a 22-fold improvement, and the additional circuitry incurs an area overhead of 27%, which is better than state-of-the-art reliable router architectures. The results also show that throughput decreases by only 5.19% and average latency increases by only 2.40%, while high reliability is maintained.


1996 ◽  
Vol 45 (2) ◽  
pp. 332-340 ◽  
Author(s):  
G. Muller ◽  
M. Banatre ◽  
N. Peyrouze ◽  
B. Rochat

2013 ◽  
pp. 695-714
Author(s):  
Hana Kubatova ◽  
Pavel Kubalik

The main aim of this chapter is to present a way to design fault-tolerant or fail-safe systems in programmable hardware (FPGAs), and thereby to enable the use of FPGAs in mission-critical applications. RAM-based FPGAs are usually considered unreliable because of the high probability of transient faults (SEUs), and therefore inapplicable in this area. However, FPGAs can be easily reconfigured. The authors' aim is to utilize an appropriate type of FPGA reconfiguration and to combine it with well-known methods for fail-safe and fault-tolerant design (duplex, TMR), including on-line testing methods that detect faults and then start the reconfiguration process. The calculation of dependability parameters based on reliability models is an integral part of the proposed methodology. The trade-off between the requested level of dependability of a designed system and the area overhead, with respect to possible FPGA faults, is the main property and advantage of the proposed methodology.
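The chapter's own reliability models are not given in the abstract. As a rough illustration of the kind of dependability calculation involved, the sketch below uses the textbook formulas for identical, independent modules with a constant failure rate and a perfect voter or comparator. Those conditions, and all function names, are assumptions made here.

```python
import math


def module_reliability(failure_rate, t):
    """Reliability of one module under an assumed constant failure rate."""
    return math.exp(-failure_rate * t)


def tmr_reliability(r):
    """TMR with a perfect voter: at least two of three modules must work,
    which gives 3r^2 - 2r^3."""
    return 3 * r**2 - 2 * r**3


def duplex_both_correct(r):
    """Duplex with a perfect comparator: probability that both modules
    work, so the outputs agree and are correct."""
    return r**2
```

For example, `tmr_reliability(0.9)` evaluates to about 0.972, while at `r = 0.5` TMR gives no gain over a single module, which is one reason the trade-off against area overhead matters.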


2006 ◽  
Vol 9 (2) ◽  
Author(s):  
Goutam Kumar Saha

This paper describes a single-version algorithmic approach to designing fault-tolerant computing into various computing systems, using static redundancy to mask transient bit errors in processor memory and registers. This low-cost single-version scheme relies on a time redundancy approach. It does not intend to tolerate software design bugs. Instead of using multiple, independent versions of an application program, it uses multiple copies of the same application program. The approach is useful for tolerating various malicious code modifications and transient faults during the run time of a computing application system, without incurring the additional cost of the extra hardware and extra software versions required by an N-version programming (NVP) scheme or a recovery block scheme (RBS). The proposed model is practical and usable, and demands an affordable redundancy in time and space. The proposed scheme is capable of tolerating various operational faults that might occur during the execution of an application.
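A minimal sketch of the multiple-copies idea, assuming three copies and a simple majority vote (the abstract does not state the number of copies or the voting rule): the copies of one program are run one after another and the agreed result is returned, so a transient fault or modification affecting a single copy is masked. The function name `run_with_time_redundancy` is hypothetical.

```python
from collections import Counter


def run_with_time_redundancy(copies, *args):
    """Run each copy of the same routine in sequence and return the result
    that a strict majority agrees on. Results must be hashable."""
    results = [copy(*args) for copy in copies]
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority among copies; fault cannot be masked")
    return value


def square(x):
    return x * x


def corrupted_square(x):
    # Stand-in for a copy hit by a bit flip in its result (illustrative only).
    return (x * x) ^ 4


# Two intact copies outvote the corrupted one.
result = run_with_time_redundancy([square, corrupted_square, square], 5)
```

Because every copy implements the same design, a bug in that design would appear in all results and be voted through, which is consistent with the abstract's statement that design bugs are out of scope.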


Nature ◽  
2021 ◽  
Vol 595 (7867) ◽  
pp. 383-387
Author(s):  
Zijun Chen ◽  
Kevin J. Satzinger ◽  
Juan Atalaya ◽  
Alexander N. Korotkov ◽  
...  

Realizing the potential of quantum computing requires sufficiently low logical error rates [1]. Many applications call for error rates as low as 10^-15 (refs. 2–9), but state-of-the-art quantum platforms typically have physical error rates near 10^-3 (refs. 10–14). Quantum error correction [15–17] promises to bridge this divide by distributing quantum logical information across many physical qubits in such a way that errors can be detected and corrected. Errors on the encoded logical qubit state can be exponentially suppressed as the number of physical qubits grows, provided that the physical error rates are below a certain threshold and stable over the course of a computation. Here we implement one-dimensional repetition codes embedded in a two-dimensional grid of superconducting qubits that demonstrate exponential suppression of bit-flip or phase-flip errors, reducing logical error per round more than 100-fold when increasing the number of qubits from 5 to 21. Crucially, this error suppression is stable over 50 rounds of error correction. We also introduce a method for analysing error correlations with high precision, allowing us to characterize error locality while performing quantum error correction. Finally, we perform error detection with a small logical qubit using the 2D surface code on the same device [18,19] and show that the results from both one- and two-dimensional codes agree with numerical simulations that use a simple depolarizing error model. These experimental demonstrations provide a foundation for building a scalable fault-tolerant quantum computer with superconducting qubits.
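The exponential suppression described above can be illustrated with a purely classical toy model: a bit is copied `n_data` times, each copy flips independently with probability `p`, and a majority vote decodes it. This is not the experiment's decoder or noise model, and it ignores measurement errors and repeated rounds. The function name and the choice of `p` are assumptions for illustration.

```python
from math import comb


def logical_error_rate(n_data, p):
    """Probability that a majority vote over n_data (odd) independent copies
    decodes to the wrong value when each copy flips with probability p."""
    threshold = n_data // 2 + 1  # smallest number of flips that fools the vote
    return sum(
        comb(n_data, k) * p**k * (1 - p) ** (n_data - k)
        for k in range(threshold, n_data + 1)
    )


# In this toy model, each step up in code size shrinks the logical error
# rate by a growing factor when p is well below one half.
rates = [logical_error_rate(n, 0.01) for n in (3, 5, 7)]
```

With `p = 0.1`, three copies give a logical error rate of 0.028, already below the single-copy rate of 0.1. The same calculation with `p` above 0.5 makes larger codes worse, a simple analogue of the threshold condition mentioned in the abstract.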

