Towards a New Semantic Metric for Error Detection Based on Program State Redundancy

Author(s):  
Dalila Amara Amara ◽  
Latifa Ben Arfa Rabai

Fault tolerance techniques are generally based around a common concept that is redundancy whose measurement is required. A suite of four semantic metrics is proposed to assess program redundancy and reflect their ability to tolerate faults. Literature shows that one of these metrics, namely state redundancy, is limited to compute program redundancy only in their initial and final states and ignores their internal states. Consequently, the authors focus in this paper to overcome this shortcoming by proposing a new redundancy-based semantic metric that computes the redundancy of the different program states including internal ones. The empirical study they perform shows that the proposed metric is a measure of program redundancy in one side and is an error detection indicator in another side. Moreover, they demonstrate that it is more accurate than the basic state redundancy metric in detecting masked errors. It is useful for testers to indicate if a tested program is error-free and to pinpoint the presence of masked errors even if the final states are equal to the expected ones.

1990 ◽  
Vol 16 (4) ◽  
pp. 432-443 ◽  
Author(s):  
N.G. Leveson ◽  
S.S. Cha ◽  
J.C. Knight ◽  
T.J. Shimeall

Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2074
Author(s):  
J.-Carlos Baraza-Calvo ◽  
Joaquín Gracia-Morán ◽  
Luis-J. Saiz-Adalid ◽  
Daniel Gil-Tomás ◽  
Pedro-J. Gil-Vicente

Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case example, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the adaptive fault tolerance mechanism proposed in two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.


1992 ◽  
Vol 02 (03) ◽  
pp. 281-304
Author(s):  
SANJAY P. POPLI ◽  
MAGDY A. BAYOUMI ◽  
AKASH TYAGI

Real-time digital signal processing (DSP) applications require high performance parallel architectures that are also reliable. VLSI arrays are good candidates for providing the required high throughput for these applications. These arrays which consist of a number of regularly interconnected processing elements (PEs) will not function correctly in the presence of even a single fault in any of the PEs. Fault tolerance has therefore become a vital design criterion for VLSI arrays. In this paper, a fault tolerance strategy for VLSI arrays is proposed, which significantly improves the reliability of the system. The fault tolerance scheme is composed of two phases: testing and locating faults (fault detection and diagnosis), and reconfiguration. The first phase employs an on-line error detection technique which achieves a compromise between the space and time redundancy approaches. This concurrent error detection technique reduces the rollback time considerably. The reconfiguration phase is achieved by using a global control responsible for changing the states of the switches in the interconnection network. Backtracking is introduced into the algorithm for maximizing the processor utilization, at the same time keeping the complexity of the interconnection network as simple as possible. Finally, a reliability analysis of this scheme using a Markov model and a comparison with some previous schemes are given.


Author(s):  
Jon Bellona ◽  
Lin Bai ◽  
Luke Dahl ◽  
Amy LaViers

Since people often communicate internal states and intentions through movement, robots can better interact with humans if they too can modify their movements to communicate changing state. These movements, which may be seen as supplementary to those required for workspace tasks, may be termed “expressive.” However, robot hardware, which cannot recreate the same range of dynamics as human limbs, often limit expressive capacity. One solution is to augment expressive robotic movement with expressive sound. To that end, this paper presents an application for synthesizing sounds that match various movement qualities. Its design is based on an empirical study analyzing sound and movement qualities, where movement qualities are parametrized according to Laban’s Effort System. Our results suggests a number of correspondences between movement qualities and sound qualities. These correspondences are presented here and discussed within the context of designing movement-quality-to-sound-quality mappings in our sound synthesis application. This application will be used in future work testing user perceptions of expressive movements with synchronous sounds.


2007 ◽  
Vol 10 (2) ◽  
Author(s):  
Goutam Kumar Saha

The term “Self-healing” denotes the capability of a software system in dealing with bugs. Fault tolerance for dependable computing is to provide the specified service through rigorous design whereas self-healing is meant for run-time issues. The paper describes various issues on designing a self-healing software application system that relies on the on-the-fly error detection and repair of web application or service agent code and data. Self-Healing is a very new area of research that deals with fault tolerance for dynamic systems. Self-healing deals with imprecise specification, uncontrolled environment and reconfiguration of system according to its dynamics. Software, which is capable of detecting and reacting to its malfunctions, is called self-healing software. Such software system has the ability to examine its failures and to take appropriate corrections. Self-Healing system must have knowledge about its expected behavior in order to examine whether its actual behavior deviates from its expected behavior in relation of the environment. A fault-model of Self-Healing system is to state what faults or injuries to be self-healed including fault duration, fault source such as, operational errors, defective system requirements or implementation errors etc. Self-healing categories of aspects include fault-model or fault hypothesis, System-response, System-completeness and Design-context. Based on many important literatures, this paper aims also to illustrate critical points of the emergent research topic of Self – Healing Software System.


2011 ◽  
Vol 2011 ◽  
pp. 1-15 ◽  
Author(s):  
Mohsin Amin ◽  
Abbas Ramazani ◽  
Fabrice Monteiro ◽  
Camille Diou ◽  
Abbas Dandache

We introduce a specialized self-checking hardware journal being used as a centerpiece in our design strategy to build a processor tolerant to transient faults. Fault tolerance here relies on the use of error detection techniques in the processor core together with journalization and rollback execution to recover from erroneous situations. Effective rollback recovery is possible thanks to using a hardware journal and chosing a stack computing architecture for the processor core instead of the usual RISC or CISC. The main objective of the journalization and the hardware self-checking journal is to prevent data not yet validated to be sent to the main memory, and allow to fast rollback execution on faulty situations. The main memory, supposed to be fault secure in our model, only contains valid (uncorrupted) data obtained from fault-free computations. Error control coding techniques are used both in the processor core to detect errors and in the HW journal to protect the temporarily stored data from possible changes induced by transient faults. Implementation results on an FPGA of the Altera Stratix-II family show clearly the relevance of the approach, both in terms of performance/area tradeoff and fault tolerance effectiveness, even for high error rates.


Sign in / Sign up

Export Citation Format

Share Document