MobileRE: A replicas prioritized hybrid fault tolerance strategy for mobile distributed system

2021 ◽  
pp. 102217
Author(s):  
Yu Wu ◽  
Duo Liu ◽  
Xianzhang Chen ◽  
Jinting Ren ◽  
Renping Liu ◽  
...  
1992 ◽  
Vol 02 (03) ◽  
pp. 281-304
Author(s):  
SANJAY P. POPLI ◽  
MAGDY A. BAYOUMI ◽  
AKASH TYAGI

Real-time digital signal processing (DSP) applications require high-performance parallel architectures that are also reliable. VLSI arrays are good candidates for providing the high throughput these applications require. These arrays, which consist of a number of regularly interconnected processing elements (PEs), will not function correctly in the presence of even a single fault in any of the PEs. Fault tolerance has therefore become a vital design criterion for VLSI arrays. In this paper, a fault tolerance strategy for VLSI arrays is proposed that significantly improves the reliability of the system. The fault tolerance scheme is composed of two phases: testing and locating faults (fault detection and diagnosis), and reconfiguration. The first phase employs an on-line error detection technique that achieves a compromise between the space and time redundancy approaches. This concurrent error detection technique reduces the rollback time considerably. The reconfiguration phase is carried out by a global control responsible for changing the states of the switches in the interconnection network. Backtracking is introduced into the algorithm to maximize processor utilization while keeping the interconnection network as simple as possible. Finally, a reliability analysis of this scheme using a Markov model and a comparison with some previous schemes are given.
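To give a feel for the kind of reliability analysis mentioned at the end of the abstract, the following is a minimal sketch and not the paper's own model. It assumes identical PEs with a constant failure rate, hot spares that fail at the same rate as active PEs, and perfect fault detection and reconfiguration. Under those assumptions the pure-death Markov chain over the number of failed PEs has a closed-form binomial solution, which the code evaluates. The function name `array_reliability` and all parameter values are illustrative.

```python
import math


def array_reliability(n_active, n_spares, failure_rate, t):
    """Probability that at least n_active of the PEs are still working at time t.

    Assumptions (not taken from the paper): exponential PE lifetimes,
    hot spares, and perfect detection and reconfiguration.
    """
    total = n_active + n_spares
    # Survival probability of a single PE up to time t.
    p = math.exp(-failure_rate * t)
    # The array survives as long as no more than n_spares PEs have failed.
    return sum(
        math.comb(total, k) * (1 - p) ** k * p ** (total - k)
        for k in range(n_spares + 1)
    )


if __name__ == "__main__":
    for spares in (0, 1, 2):
        print(spares, array_reliability(16, spares, 0.01, 10.0))
```

With zero spares the expression reduces to the series-system reliability `p ** n_active`, which matches the abstract's remark that a single faulty PE brings down an unprotected array.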


2014 ◽  
pp. 92-99
Author(s):  
N. P. Gopalan ◽  
K. Nagarajan

Checkpointing is one of the most attractive approaches for providing software fault tolerance in distributed message-passing systems. This paper implements a distributed checkpointing technique that eliminates the drawbacks of the centralized approach, such as the “domino effect”, “useless checkpoints” (checkpoints that do not contribute to global consistency), and “hidden and zigzag” dependencies. The proposed checkpointing protocol has a checkpoint initiator, but coordination among the local checkpoints is done in a distributed fashion. The guarantee that no message is lost when a failure occurs is maintained by exchanging information among the processes. There is no central checkpoint initiator; instead, each process takes a turn acting as the initiator. Processes take local checkpoints only after being notified by the initiator. The processes synchronize their activities for the current checkpointing interval before finally committing their checkpoints. Thus, the checkpointing pattern described in this paper takes only those checkpoints that contribute to a consistent global snapshot, thereby eliminating useless checkpoints.
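The rotating-initiator idea in this abstract can be illustrated with a small single-threaded simulation. This is a sketch under simplifying assumptions, not the authors' protocol: message passing is replaced by direct method calls, the initiator for each round is chosen round-robin, and a checkpoint is committed only if every process acknowledged a tentative one. The names `Process` and `checkpoint_round` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Process:
    pid: int
    state: int = 0
    tentative: Optional[Tuple[int, int]] = None
    committed: List[Tuple[int, int]] = field(default_factory=list)

    def take_tentative(self, round_id: int) -> bool:
        # Record local state only when notified by the initiator.
        self.tentative = (round_id, self.state)
        return True  # acknowledgement back to the initiator

    def commit(self) -> None:
        # Make the tentative checkpoint permanent.
        self.committed.append(self.tentative)
        self.tentative = None

    def abort(self) -> None:
        # Discard the tentative checkpoint so no useless checkpoint is kept.
        self.tentative = None


def checkpoint_round(processes: List[Process], round_id: int) -> int:
    """Run one checkpointing round and return the pid of its initiator."""
    # Assumption: processes take turns as initiator in round-robin order.
    initiator = processes[round_id % len(processes)]
    # The initiator notifies every process (itself included) and collects acks.
    acks = [p.take_tentative(round_id) for p in processes]
    if all(acks):
        for p in processes:
            p.commit()
    else:
        for p in processes:
            p.abort()
    return initiator.pid


if __name__ == "__main__":
    procs = [Process(i) for i in range(3)]
    for r in range(3):
        print("round", r, "initiator", checkpoint_round(procs, r))
```

Because every committed checkpoint belongs to a round in which all processes participated, each round's checkpoints form one consistent set in this toy setting.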


2014 ◽  
Vol 511-512 ◽  
pp. 1012-1016 ◽  
Author(s):  
Zhi Qiang Wang ◽  
Xiao Long Li ◽  
Qing Zhen Wang

For current sensor failures on a maglev train, an active fault-tolerant control strategy based on feedback gain reconfiguration is proposed. A fault diagnosis unit based on a state observer is designed to monitor the output of the current sensor, and the diagnosis result is used to switch the control strategy. Simulation results indicate that the fault tolerance strategy meets the demands of the system.
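The observer-based diagnosis and gain switching described here can be sketched on a toy problem. The example below is not the maglev model from the paper: it assumes a scalar discrete-time plant, a Luenberger-style observer, a sensor that sticks at zero when it fails, and made-up gains and threshold. Once the residual between the sensor reading and the observer estimate exceeds the threshold, the controller switches to a second feedback gain and uses the estimate instead of the faulty measurement.

```python
def simulate(steps=90, fault_step=30, threshold=0.1):
    """Return (step at which the fault was detected, final plant state)."""
    # All numeric values below are illustrative assumptions.
    a, b = 0.9, 0.1                # plant: x[k+1] = a*x[k] + b*u[k]
    observer_gain = 0.5            # observer correction gain
    k_nominal, k_fault = 2.0, 1.0  # feedback gains before and after the fault
    ref = 1.0                      # reference to track

    x = 0.0      # true state
    x_hat = 0.0  # observer estimate
    detected_at = None

    for k in range(steps):
        # Sensor output: healthy before fault_step, stuck at zero afterwards.
        y = x if k < fault_step else 0.0

        # Diagnosis: compare the sensor output with the observer estimate.
        residual = abs(y - x_hat)
        if detected_at is None and residual > threshold:
            detected_at = k

        if detected_at is None:
            # Nominal control uses the measurement; observer is corrected by y.
            u = k_nominal * (ref - y)
            x_hat_next = a * x_hat + b * u + observer_gain * (y - x_hat)
        else:
            # Reconfigured control uses the estimate; the faulty y is ignored.
            u = k_fault * (ref - x_hat)
            x_hat_next = a * x_hat + b * u

        x = a * x + b * u
        x_hat = x_hat_next

    return detected_at, x


if __name__ == "__main__":
    print(simulate())
```

In this noise-free setting the residual is zero until the sensor fails, so detection happens at the fault step itself; a real system would need a threshold tuned against measurement noise and model mismatch.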

