Susceptible workload driven selective fault tolerance using a probabilistic fault model

Author(s):  
Mauricio D. Gutierrez ◽  
Vasileios Tenentes ◽  
Tom J. Kazmierski
2007 ◽  
Vol 10 (2) ◽  
Author(s):  
Goutam Kumar Saha

The term "self-healing" denotes the capability of a software system to deal with bugs. Fault tolerance for dependable computing aims to provide the specified service through rigorous design, whereas self-healing addresses run-time issues. This paper describes various issues in designing a self-healing software application system that relies on on-the-fly error detection and repair of web application or service agent code and data. Self-healing is a new research area that deals with fault tolerance for dynamic systems; it must cope with imprecise specifications, uncontrolled environments, and reconfiguration of the system according to its dynamics. Software capable of detecting and reacting to its own malfunctions is called self-healing software: such a system can examine its failures and take appropriate corrective action. A self-healing system must have knowledge of its expected behavior in order to determine whether its actual behavior deviates from it in relation to its environment. A fault model for a self-healing system states which faults or injuries are to be self-healed, including fault duration and fault source, such as operational errors, defective system requirements, or implementation errors. Aspects of self-healing fall into the categories of fault model (or fault hypothesis), system response, system completeness, and design context. Drawing on the key literature, this paper also aims to illustrate the critical points of the emergent research topic of self-healing software systems.
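To make the detect-and-repair loop concrete, here is a minimal Python sketch, assuming a hypothetical monitor that compares a component's observed output against a model of its expected behavior and invokes a registered repair action on deviation; all class and function names are illustrative, not from the paper:

```python
# Minimal self-healing loop: detect deviation from expected behavior,
# then apply a registered repair action. All names are illustrative.

class SelfHealingMonitor:
    def __init__(self, expected_model, repair_action):
        self.expected_model = expected_model  # callable: input -> expected output
        self.repair_action = repair_action    # callable: component -> None

    def run(self, component, inputs):
        for x in inputs:
            actual = component.process(x)
            expected = self.expected_model(x)
            if actual != expected:             # fault detected at run time
                self.repair_action(component)  # e.g. reload code, reset state
                actual = component.process(x)  # retry after repair
            yield actual

class EchoAgent:
    """Toy web-service agent whose state can be corrupted."""
    def __init__(self):
        self.corrupted = False
    def process(self, x):
        return None if self.corrupted else x

agent = EchoAgent()
agent.corrupted = True  # inject a run-time fault

def repair(component):
    component.corrupted = False  # restore a known-good state

monitor = SelfHealingMonitor(expected_model=lambda x: x, repair_action=repair)
print(list(monitor.run(agent, [1, 2, 3])))  # -> [1, 2, 3] after self-repair
```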


2021 ◽  
Author(s):  
Raha Abedi

One of the main goals of fault injection techniques is to evaluate the fault tolerance of a design. To have greater confidence in the fault tolerance of a system, an accurate fault model is essential. While more accurate than gate-level models, transistor-level fault models cannot be synthesized into FPGA chips; transistor-level faults must therefore be mapped to the gate level to obtain both accuracy and synthesizability. Re-synthesizing a large system for fault injection is not cost-effective when the number of faults and the system complexity are high, so the system must be divided into partitions to reduce re-synthesis time, since faults are injected into only a portion of the system. However, the complexity of module-based partial reconfiguration rises with the total number of partitions. An unbalanced partitioning methodology is introduced to reduce the total number of partitions while keeping the partitions into which faults are injected small enough to achieve an acceptable re-synthesis time.
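A minimal sketch of the unbalanced-partitioning idea, assuming hypothetical module sizes and a simple greedy packing: fault-injection targets go into small partitions (cheap to re-synthesize), while the remaining logic is merged into a few large partitions (few reconfigurable modules overall). All names, sizes, and thresholds are illustrative, not from the paper:

```python
# Hedged sketch of unbalanced partitioning: keep partitions that receive
# fault injections small (fast re-synthesis), and merge everything else
# into a few large partitions (fewer reconfigurable modules in total).

def partition(modules, fault_targets, small_cap, large_cap):
    """modules: {name: size}; fault_targets: names that receive faults."""
    def pack(names, cap):
        bins, current, used = [], [], 0
        for n in sorted(names, key=modules.get, reverse=True):
            if used + modules[n] > cap and current:
                bins.append(current)
                current, used = [], 0
            current.append(n)
            used += modules[n]
        if current:
            bins.append(current)
        return bins

    small = pack([n for n in modules if n in fault_targets], small_cap)
    large = pack([n for n in modules if n not in fault_targets], large_cap)
    return small, large

mods = {"alu": 900, "regfile": 400, "ctrl": 150, "uart": 120, "dma": 600}
small, large = partition(mods, fault_targets={"ctrl", "uart"},
                         small_cap=200, large_cap=2000)
print(small)  # small fault-injection partitions: [['ctrl'], ['uart']]
print(large)  # one big static partition: [['alu', 'dma', 'regfile']]
```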


VLSI Design ◽  
2007 ◽  
Vol 2007 ◽  
pp. 1-13 ◽  
Author(s):  
Teijo Lehtonen ◽  
Pasi Liljeberg ◽  
Juha Plosila

We propose link structures for NoC that efficiently tolerate transient, intermittent, and permanent errors, a necessary step toward implementing reliable systems in future nanoscale technologies. Protection against transient errors is realized using Hamming coding and interleaving for error detection, with retransmission as the recovery method. We introduce two approaches for tackling intermittent and permanent errors. In the first approach, spare wires are introduced together with reconfiguration circuitry. The other approach uses time redundancy: the transmission is split into two parts, and the data is doubled. In both structures the presence of permanent or intermittent errors is monitored by analyzing previous error syndromes. The links are based on self-timed signaling in which the handshake signals are protected using triple modular redundancy. We present the structures, operation, and designs of the different link components. The fault tolerance properties are analyzed using a fault model containing temporary, intermittent, and permanent faults that occur both as bursts and as single faults. The results show a considerable enhancement in fault tolerance at the cost of performance and area, with only a slight increase in power consumption.
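As a rough illustration of the transient-error scheme, the following Python sketch implements standard Hamming(7,4) encoding with single-error correction and keeps a syndrome history that flags a wire as a suspected intermittent or permanent fault when the same position fails repeatedly; the threshold and the per-nibble framing are assumptions, not the paper's design:

```python
# Hamming(7,4) per 4-bit nibble, single-error correction at the receiver,
# plus a syndrome history that marks a link wire as intermittently or
# permanently faulty when the same position keeps failing.

from collections import Counter

def hamming74_encode(d):            # d: list of 4 data bits
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]  # positions 1..7

def hamming74_decode(c):            # c: list of 7 code bits
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 0 = no error, else 1-based bit position
    if syndrome:
        c = c[:]
        c[syndrome - 1] ^= 1         # correct the single flipped bit
    return [c[2], c[4], c[5], c[6]], syndrome

history = Counter()                  # syndrome position -> occurrence count
PERMANENT_THRESHOLD = 3              # illustrative: 3 hits => not transient

def receive(codeword):
    data, syndrome = hamming74_decode(codeword)
    if syndrome:
        history[syndrome] += 1
        if history[syndrome] >= PERMANENT_THRESHOLD:
            print(f"wire {syndrome}: suspected permanent/intermittent fault")
    return data

word = hamming74_encode([1, 0, 1, 1])
for _ in range(3):                   # same wire stuck: repeated syndrome
    faulty = word[:]
    faulty[4] ^= 1                   # stuck bit on wire 5
    assert receive(faulty) == [1, 0, 1, 1]
```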


1995 ◽  
Vol 06 (04) ◽  
pp. 401-416 ◽  
Author(s):  
PETER J. EDWARDS ◽  
ALAN F. MURRAY

This paper investigates fault tolerance in feedforward neural networks, for a realistic fault model based on analog hardware. In our previous work with synaptic weight noise [26] we showed significant fault tolerance enhancement over standard training algorithms. We proposed that, when introduced into training, weight noise distributes the network computation more evenly across the weights and thus enhances fault tolerance. Here we compare those results with an approximation to the mechanisms induced by stochastic weight noise, incorporated into training deterministically via penalty terms. The penalty terms are an approximation to weight saliency; therefore, in addition, we assess a number of other weight-saliency measures and perform comparison experiments. The results show that the first-term approximation is an incomplete model of weight noise in terms of fault tolerance, and that the error Hessian is the most accurate measure of weight saliency.
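As a sketch of one saliency measure discussed above, the snippet below estimates the diagonal of the error Hessian with central finite differences and ranks weights by an Optimal-Brain-Damage-style saliency 0.5·H_ii·w_i²; the toy model, data, and step size are illustrative assumptions, not the authors' setup:

```python
# Ranking weights by saliency using the diagonal of the error Hessian,
# estimated here with central finite differences on a toy model.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = np.tanh(X @ rng.normal(size=4))

w = rng.normal(size=4)               # trained weights would go here

def error(w):
    return np.mean((np.tanh(X @ w) - y) ** 2)

def hessian_diag(error_fn, w, eps=1e-4):
    """d2E/dw_i^2 via central differences, one weight at a time."""
    diag = np.empty_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w); e[i] = eps
        diag[i] = (error_fn(w + e) - 2 * error_fn(w) + error_fn(w - e)) / eps**2
    return diag

# Saliency of weight i ~ 0.5 * H_ii * w_i^2 (Optimal Brain Damage style):
saliency = 0.5 * hessian_diag(error, w) * w ** 2
ranking = np.argsort(saliency)[::-1]
print("weights ranked most-to-least salient:", ranking)
```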


Author(s):  
M. Chaitanya ◽  
K. Durga Charan

Load balancing makes cloud computing more efficient and increases user satisfaction. At present, cloud computing is among the foremost platforms, offering data storage at very low cost and available at all times over the Internet. However, it faces critical issues such as security, load management, and fault tolerance. Load balancing in the cloud computing environment has a significant impact on performance. The proposed algorithm applies game theory to the load-balancing process to improve efficiency in the public cloud environment. This paper presents an improved load-balancing model for the public cloud based on cloud partitioning, with a switch mechanism to select different strategies for different situations.
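A minimal sketch of such a switch mechanism, under the assumption that each partition reports an idle/normal/overloaded status and a main controller switches strategies accordingly; all thresholds, names, and the specific per-state strategies are illustrative, not the abstract's design:

```python
# Switch mechanism sketch: round-robin when a partition is idle, least-
# loaded assignment when it is under normal load, and refusal of jobs to
# overloaded partitions, which the controller then skips.

import itertools

IDLE, NORMAL, OVERLOADED = "idle", "normal", "overloaded"

class Partition:
    def __init__(self, name, node_loads):
        self.name = name
        self.node_loads = node_loads            # load per node, 0.0..1.0
        self._rr = itertools.cycle(range(len(node_loads)))

    def status(self):
        avg = sum(self.node_loads) / len(self.node_loads)
        if avg < 0.2:  return IDLE
        if avg < 0.8:  return NORMAL
        return OVERLOADED

    def assign(self, cost):
        if self.status() == IDLE:               # cheap strategy suffices
            node = next(self._rr)
        else:                                   # balance onto least-loaded node
            node = min(range(len(self.node_loads)),
                       key=self.node_loads.__getitem__)
        self.node_loads[node] += cost
        return node

def dispatch(partitions, cost):
    """Main controller: pick the first non-overloaded partition."""
    for p in partitions:
        if p.status() != OVERLOADED:
            return p.name, p.assign(cost)
    raise RuntimeError("all partitions overloaded; queue the job")

cloud = [Partition("A", [0.9, 0.85]), Partition("B", [0.1, 0.05])]
print(dispatch(cloud, cost=0.1))   # -> ('B', 0) since A is overloaded
```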


Author(s):  
Rugui Yao ◽  
Fanqi Gao ◽  
Ling Wang ◽  
Yinghui Wang