Self-Stabilizing Algorithms for Tree Metrics

We present algorithms for finding the diameter, centroid(s), and median(s) of tree-structured networks subject to transient faults. In our solutions, the system reaches its final correct configuration in finite time after the faults cease. Fault tolerance is achieved using Dijkstra's paradigm of self-stabilization: a self-stabilizing algorithm converges in finite time to a set of legitimate configurations, regardless of the initial system configuration.


Self-Stabilizing Graph Coloring Algorithms

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Distributed Computing Innovations for Business, Engineering, and Science ◽

10.4018/978-1-4666-2533-4.ch005 ◽

2013 ◽

pp. 96-110

Author(s):

Shing-Tsaan Huang ◽

Chi-Hung Tzeng ◽

Jehn-Ruey Jiang

Keyword(s):

Distributed Computing ◽

Graph Coloring ◽

Finite Time ◽

Planar Graphs ◽

Edge Coloring ◽

Vertex Coloring ◽

Transient Faults ◽

Initial State ◽

Self Stabilization ◽

Legitimate State

The concept of self-stabilization in distributed systems was introduced by Dijkstra in 1974. A system is said to be self-stabilizing if (1) it converges in finite time to a legitimate state from any initial state, and (2) once it is in a legitimate state, it remains so henceforth. That is, a self-stabilizing system is guaranteed to reach a legitimate state in finite time no matter what initial state it starts from; equivalently, it recovers from transient faults automatically, without any outside intervention. This chapter first introduces the self-stabilization concept in distributed computing. Next, it discusses the coloring problem on graphs and its applications in distributed computing. Then, it introduces three self-stabilizing algorithms. The first two are for vertex coloring and edge coloring on planar graphs, respectively. The last one is for edge coloring on bipartite graphs.
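
The two conditions above (convergence and closure) can be illustrated with a minimal sketch of a greedy self-stabilizing vertex coloring. This is not one of the chapter's three algorithms, whose rules are not given in the abstract. It assumes a central daemon that activates one enabled node at a time, and the names `conflicted` and `stabilize` are hypothetical. The legitimate states are those in which no edge joins two nodes of the same color.

```python
import random

def conflicted(graph, color, v):
    """A node is enabled when it shares its color with some neighbor."""
    return any(color[v] == color[u] for u in graph[v])

def stabilize(graph, color, rng):
    """Run moves under an assumed central daemon until no node is enabled.

    Each move recolors one conflicted node with the smallest color that
    none of its neighbors currently holds. The moved node ends with no
    conflicting edge and no other edge changes, so the number of
    conflicting edges strictly decreases and the loop terminates.
    Returns the number of moves made.
    """
    moves = 0
    while True:
        enabled = [v for v in graph if conflicted(graph, color, v)]
        if not enabled:
            return moves  # legitimate state: no rule is enabled (closure)
        v = rng.choice(enabled)  # the daemon picks one enabled node
        used = {color[u] for u in graph[v]}
        # deg(v) + 1 candidates, at most deg(v) of them used: one is free
        color[v] = next(c for c in range(len(graph[v]) + 1) if c not in used)
        moves += 1

# Arbitrary (faulty) initial state: every node of a 5-cycle holds color 7.
ring = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
state = {v: 7 for v in ring}
stabilize(ring, state, random.Random(0))
```

Because the number of moves is bounded by the number of initially conflicting edges, the sketch converges from any initial coloring; under a distributed daemon, where neighbors may move simultaneously, this simple rule would need an additional symmetry-breaking mechanism.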


Fault-tolerance and self-stabilization: impossibility results and solutions using self-stabilizing failure detectors

International Journal of Systems Science ◽

10.1080/00207729708929476 ◽

1997 ◽

Vol 28 (11) ◽

pp. 1177-1187 ◽

Cited By ~ 28

Author(s):

JOFFROY BEAUQUIER ◽

SYNNÖVE KEKKONEN-MONETA

Keyword(s):

Fault Tolerance ◽

Failure Detectors ◽

Impossibility Results ◽

Self Stabilization


Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM

Electronics ◽

10.3390/electronics9122074 ◽

2020 ◽

Vol 9 (12) ◽

pp. 2074

Author(s):

J.-Carlos Baraza-Calvo ◽

Joaquín Gracia-Morán ◽

Luis-J. Saiz-Adalid ◽

Daniel Gil-Tomás ◽

Pedro-J. Gil-Vicente

Keyword(s):

Fault Tolerance ◽

Error Correction ◽

Error Detection ◽

Fault Injection ◽

Error Correction Codes ◽

Transient Faults ◽

Tolerance Mechanism ◽

Intermittent Faults ◽

Risc Processor ◽

Simulation Based

Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case study, we have designed a mechanism that can detect intermittent faults and switch from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) code and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the proposed adaptive fault tolerance mechanism on two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.
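
To make the (39, 32) SEC–DED parameters concrete, the following sketch builds a textbook extended Hamming code with 32 data bits, 6 Hamming check bits, and 1 overall parity bit. It is an assumption for illustration only: the abstract does not give the parity-check matrix of the authors' SEC–DED code, and EPB3932 is not reproduced here. The names `encode`, `decode`, and the bit layout are hypothetical.

```python
PARITY_POS = [1, 2, 4, 8, 16, 32]  # Hamming check bits sit at powers of two
DATA_POS = [p for p in range(1, 39) if p not in PARITY_POS]  # 32 data positions

def encode(word):
    """Encode a 32-bit integer as a list of 39 bits (index 0 = overall parity)."""
    bits = [0] * 39
    for i, p in enumerate(DATA_POS):
        bits[p] = (word >> i) & 1
    for q in PARITY_POS:
        # check bit q covers every other position whose index has bit q set
        bits[q] = sum(bits[p] for p in range(1, 39) if p & q and p != q) % 2
    bits[0] = sum(bits[1:]) % 2  # makes the total number of ones even
    return bits

def decode(bits):
    """Return (status, word) with status 'ok', 'corrected' or 'detected'."""
    bits = list(bits)
    syndrome = 0
    for p in range(1, 39):
        if bits[p]:
            syndrome ^= p  # XOR of the positions holding a one
    overall = sum(bits) % 2  # 0 when the overall parity is consistent
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome < 39:
        # odd parity: treat as a single error at position `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped)
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # even parity with a nonzero syndrome (double error), or a
        # syndrome that points outside the codeword
        return "detected", None
    word = sum(bits[p] << i for i, p in enumerate(DATA_POS))
    return status, word
```

With this layout any single bit flip is corrected and any double flip is reported as detected but uncorrectable, which is the SEC–DED guarantee; correcting adjacent double or triple errors, as EPB3932 is described as doing, requires a different code design.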


Hybrid Fault Tolerance Techniques to Detect Transient Faults in Embedded Processors

10.1007/978-3-319-06340-9 ◽

2014 ◽

Cited By ~ 5

Author(s):

José Rodrigo Azambuja ◽

Fernanda Kastensmidt ◽

Jürgen Becker

Keyword(s):

Fault Tolerance ◽

Embedded Processors ◽

Transient Faults


Adaptive fault-tolerance control based finite-time backstepping for hypersonic flight vehicle with full state constrains

Information Sciences ◽

10.1016/j.ins.2019.08.012 ◽

2020 ◽

Vol 507 ◽

pp. 53-66 ◽

Cited By ~ 5

Author(s):

Xuening Tang ◽

Ding Zhai ◽

Xiaojian Li

Keyword(s):

Fault Tolerance ◽

Finite Time ◽

Flight Vehicle ◽

Hypersonic Flight Vehicle ◽

Hypersonic Flight ◽

Full State ◽

Tolerance Control


Survey on Fault Tolerance Startgies for Advance Microelectronics Chip

International Journal on Recent and Innovation Trends in Computing and Communication ◽

10.17762/ijritcc.v7i1.5217 ◽

2019 ◽

Vol 7 (1) ◽

pp. 01-04

Author(s):

Himanshu Shekhar ◽

Prof. Deepa Gianchandani

Keyword(s):

Fault Tolerance ◽

Power Supply ◽

Fault Tolerant ◽

Full Adder ◽

Equipment Design ◽

Transient Faults ◽

Transient Fault

In complex systems based on advanced microelectronics, processing units are built from ever smaller devices, which are sensitive to transient faults. A system should therefore be built to recognize the presence of faults and to incorporate techniques that tolerate them without disrupting normal operation. Transient faults in a circuit are caused by electromagnetic noise, cosmic rays, crosstalk, and power supply noise, and they are extremely hard to detect during offline testing. An area-efficient fault-tolerant full adder is therefore proposed for testing and repairing transient and permanent faults occurring in single and multiple nets. This design incurs much lower hardware overhead than the conventional hardware design. The paper also discusses various fault-tolerant methodologies for CMOS and ICs.


Efficient fault tolerance: an approach to deal with transient faults in multiprocessor architectures

Proceedings of 1994 International Conference on Parallel and Distributed Systems ◽

10.1109/icpads.1994.590322 ◽

2002 ◽

Cited By ~ 2

Author(s):

A. Bondavalli ◽

S. Chiaradonna ◽

F. Di Giandomenico

Keyword(s):

Fault Tolerance ◽

Transient Faults ◽

Multiprocessor Architectures


Fault Tolerance in Carbon Nanotube Transistors Based Multi Valued Logic

10.5772/intechopen.95361 ◽

2021 ◽

Author(s):

Gopalakrishnan Sundararajan

Keyword(s):

Fault Tolerance ◽

Error Correction ◽

Field Effect ◽

Fault Tolerant ◽

Field Effect Transistors ◽

Transient Faults ◽

Carbon Nanotube Transistors ◽

Nanotube Transistors ◽

Modular Redundancy ◽

Carbon Nano Tube

This chapter presents a solution for fault tolerance in Multi-Valued Logic (MVL) circuits composed of Carbon Nano-Tube Field Effect Transistors (CNTFET). It reviews the basic primitives of MVL and describes ternary implementations of CNTFET circuits. Finally, it describes a method for error correction called Restorative Feedback (RFB). The RFB method is a variant of Triple-Modular Redundancy (TMR) that utilizes the fault masking capabilities of the Muller C element to provide added protection against noisy transient faults. The fault-tolerant properties of the Muller C element are discussed, and the error correction capability of the RFB method is demonstrated in detail.
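
The two building blocks named in this abstract can be sketched behaviorally. The sketch below is not the RFB circuit, whose wiring the abstract does not specify; it only models a TMR majority voter and a two-input Muller C element. Applying the C element to ternary values (0, 1, 2) is an assumption, and the names `majority` and `CElement` are hypothetical.

```python
from collections import Counter

def majority(a, b, c):
    """TMR voter: the value held by at least two replicas, or None if all differ."""
    value, count = Counter((a, b, c)).most_common(1)[0]
    return value if count >= 2 else None

class CElement:
    """Muller C element: the output follows the inputs when they agree
    and holds its previous value when they disagree."""

    def __init__(self, initial=0):
        self.out = initial

    def step(self, a, b):
        if a == b:
            self.out = a  # inputs agree: output takes their common value
        return self.out   # inputs disagree: previous output is held

# A transient glitch on one input is masked because the output is held.
c = CElement(initial=0)
c.step(2, 2)   # both inputs agree on 2, output becomes 2
c.step(2, 0)   # one input glitches, output stays 2
```

The holding behavior is what gives the C element its fault-masking property: a transient disturbance on a single input cannot change the output on its own.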


A Self-Checking Hardware Journal for a Fault-Tolerant Processor Architecture

International Journal of Reconfigurable Computing ◽

10.1155/2011/962062 ◽

2011 ◽

Vol 2011 ◽

pp. 1-15 ◽

Cited By ~ 3

Author(s):

Mohsin Amin ◽

Abbas Ramazani ◽

Fabrice Monteiro ◽

Camille Diou ◽

Abbas Dandache

Keyword(s):

Fault Tolerance ◽

Error Detection ◽

Error Control ◽

Fault Tolerant ◽

Error Rates ◽

Main Memory ◽

Transient Faults ◽

Processor Core ◽

Detection Techniques ◽

Performance Area

We introduce a specialized self-checking hardware journal used as the centerpiece of our design strategy to build a processor tolerant to transient faults. Fault tolerance here relies on the use of error detection techniques in the processor core together with journalization and rollback execution to recover from erroneous situations. Effective rollback recovery is possible thanks to the hardware journal and to the choice of a stack computing architecture for the processor core instead of the usual RISC or CISC. The main objective of the journalization and the self-checking hardware journal is to prevent data not yet validated from being sent to the main memory, and to allow fast rollback of execution when a fault occurs. The main memory, assumed to be fault secure in our model, contains only valid (uncorrupted) data obtained from fault-free computations. Error control coding techniques are used both in the processor core to detect errors and in the hardware journal to protect the temporarily stored data from possible changes induced by transient faults. Implementation results on an FPGA of the Altera Stratix-II family clearly show the relevance of the approach, both in terms of performance/area tradeoff and fault tolerance effectiveness, even for high error rates.
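
The journalization idea described above can be sketched in software as a write buffer placed in front of a trusted memory. This is an illustration under stated assumptions, not the authors' hardware design: the class `JournaledMemory` and its method names are hypothetical, and the rollback of the processor core's own state and the error control coding of journal entries are omitted.

```python
class JournaledMemory:
    """Toy model: unvalidated writes stay in a journal until validated."""

    def __init__(self):
        self.main = {}     # trusted memory: holds only validated data
        self.journal = {}  # writes made since the last validation point

    def write(self, addr, value):
        # Writes never reach main memory directly.
        self.journal[addr] = value

    def read(self, addr):
        # The most recent unvalidated value wins; otherwise read trusted
        # memory (unwritten addresses read as 0 in this toy model).
        if addr in self.journal:
            return self.journal[addr]
        return self.main.get(addr, 0)

    def validate(self):
        """No error detected in the sequence: commit the journal to main memory."""
        self.main.update(self.journal)
        self.journal.clear()

    def rollback(self):
        """Error detected: discard every unvalidated write."""
        self.journal.clear()

mem = JournaledMemory()
mem.write(0x10, 5)
mem.validate()        # 5 is now in trusted memory
mem.write(0x10, 99)   # possibly corrupted result, still in the journal
mem.rollback()        # error detected: the write is dropped
```

After the rollback, a read of address 0x10 returns the last validated value, so main memory never holds data produced by a faulty computation.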


Finite-Time Fault Estimator Based Fault-Tolerance Control for a Surface Vehicle With Input Saturations

IEEE Transactions on Industrial Informatics ◽

10.1109/tii.2019.2930471 ◽

2020 ◽

Vol 16 (2) ◽

pp. 1172-1181 ◽

Cited By ~ 25

Author(s):

Ning Wang ◽

Zhongchao Deng

Keyword(s):

Fault Tolerance ◽

Finite Time ◽

Tolerance Control
