scholarly journals Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM

Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2074
Author(s):  
J.-Carlos Baraza-Calvo ◽  
Joaquín Gracia-Morán ◽  
Luis-J. Saiz-Adalid ◽  
Daniel Gil-Tomás ◽  
Pedro-J. Gil-Vicente

Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case example, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the adaptive fault tolerance mechanism proposed in two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.

Author(s):  
Luis-J. Saiz-Adalid ◽  
Pedro Gil ◽  
Juan-Carlos Ruiz ◽  
Joaquin Gracia-Moran ◽  
Daniel Gil-Tomas ◽  
...  

2019 ◽  
Vol 8 (2S8) ◽  
pp. 1948-1952

The developments in IC technology and rapid increase of transistor densities and scaling factor, the use of ECC’s acquired prominence. Multiple bit errors in memories due to technology scaling demands advanced error correction codes. SEC-DEC, DEC, burst error detection, Golay code, Reed Solmon codes etc. have much decoding complexity and latency. The above drawbacks can be reduced with OLS codes. OLS codes with majority logic decoding technique, modular construction and simple decoding mechanisms it enables low delay improvements. MBU’S can be addressed using OLS-MLD codes. This paper presents a detail study of developments in multibit ECC’s using OLS-MLD mechanism


2007 ◽  
Vol 2 (1) ◽  
pp. 14-21
Author(s):  
Carlos R. Moratelli ◽  
Érika Cota ◽  
Marcelo S. Lubaszewski

This work describes a hardware approach for the concurrent fault detection and error correction in a cryptographic core. It has been shown in the literature that transient faults injected in a cryptographic core can lead to the revelation of the encryption key using quite inexpensive equipments. This kind of attack is a real threat to tamper resistant devices like Smart Cards. To tackle such attacks, the cryptographic core must be immune to transient faults. In this work the DES algorithm is taken as a vulnerable cryptosystem case study.We show how an attack against DES is performed through a fault injection campaign. Then, a countermeasure based on partial hardware replication is proposed and applied to DES. Experimental results show the efficiency of the proposed scheme to protect DES against DFA fault attacks. Furthermore, the proposed solution is independent of implementation, and can be applied to other cryptographic algorithms, such as AES.


2021 ◽  
Author(s):  
Gopalakrishnan Sundararajan

This Chapter presents a solution for fault-tolerance in Multi-Valued Logic (MVL) circuits comprised of Carbon Nano-Tube Field Effect Transistors (CNTFET). This chapter reviews basic primitives of MVL and describes ternary implementations of CNTFET circuits. Finally, this chapter describes a method for error correction called Restorative Feedback (RFB). The RFB method is a variant of Triple-Modular Redundancy (TMR) that utilizes the fault masking capabilities of the Muller C element to provide added protection against noisy transient faults. Fault tolerant properties of Muller C element is discussed and error correction capability of RFB method is demonstrated in detail.


2011 ◽  
Vol 2011 ◽  
pp. 1-15 ◽  
Author(s):  
Mohsin Amin ◽  
Abbas Ramazani ◽  
Fabrice Monteiro ◽  
Camille Diou ◽  
Abbas Dandache

We introduce a specialized self-checking hardware journal being used as a centerpiece in our design strategy to build a processor tolerant to transient faults. Fault tolerance here relies on the use of error detection techniques in the processor core together with journalization and rollback execution to recover from erroneous situations. Effective rollback recovery is possible thanks to using a hardware journal and chosing a stack computing architecture for the processor core instead of the usual RISC or CISC. The main objective of the journalization and the hardware self-checking journal is to prevent data not yet validated to be sent to the main memory, and allow to fast rollback execution on faulty situations. The main memory, supposed to be fault secure in our model, only contains valid (uncorrupted) data obtained from fault-free computations. Error control coding techniques are used both in the processor core to detect errors and in the HW journal to protect the temporarily stored data from possible changes induced by transient faults. Implementation results on an FPGA of the Altera Stratix-II family show clearly the relevance of the approach, both in terms of performance/area tradeoff and fault tolerance effectiveness, even for high error rates.


Author(s):  
Jagannath Samanta ◽  
Akash Kewat

Recently, there have been continuous rising interests of multi-bit error correction codes (ECCs) for protecting memory cells from soft errors which may also enhance the reliability of memory systems. The single error correction and double error detection (SEC-DED) codes are generally employed in many high-speed memory systems. In this paper, Hsiao-based SEC-DED codes are optimized based on two proposed optimization algorithms employed in parity check matrix and error correction logic. Theoretical area complexity of SEC-DED codecs require maximum 49.29%, 18.64% and 49.21% lesser compared to the Hsiao codes [M. Y. Hsiao, A class of optimal minimum odd-weight-column SEC-DED codes, IBM J. Res. Dev. 14 (1970) 395–401], Reviriego et al. codes [P. Reviriego, S. Pontarelli, J. A. Maestro and M. Ottavi, A method to construct low delay single error correction codes for protecting data bits only, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 32 (2013) 479–483] and Liu et al. codes [S. Liu, P. Reviriego, L. Xiao and J. A. Maestro, A method to recover critical bits under a double error in SEC-DED protected memories, Microelectron. Reliab. 73 (2017) 92–96], respectively. Proposed codec is designed and implemented both in field programmable gate array (FPGA) and ASIC platforms. The synthesized SEC-DED codecs need 31.14% lesser LUTs than the original Hsiao code. Optimized codec is faster than the existing related codec without affecting its power consumption. These compact and faster SEC-DED codecs are employed in cache memory to enhance the reliability.


2015 ◽  
Vol 28 (3) ◽  
pp. 309-323 ◽  
Author(s):  
Navid Khoshavi ◽  
Hamid Zarandi ◽  
Mohammad Maghsoudloo

This paper presents two control-flow error recovery techniques, CFE Recovery using Data-flow graph Consideration and CFE Recovery using Macro block-level Check pointing. These techniques are proposed with regards to thread interactions in the programs. These techniques try to moderate the high memory and performance overheads of conventional control-flow checking techniques. The proposed recovery techniques are composed of two phases of control-flow error detection and recovery. These phases are designed by means of inserting additional instructions into program at compile time considering dependency graph, extracted from control-flow and data-flow dependencies among basic blocks and thread interactions in the programs. In order to evaluate the proposed techniques, five multithreaded benchmarks are utilized to run on a multi-core processor. Moreover, a total of 10000 transient faults have been injected into several executable points of each program. Fault injection experiments show that the proposed techniques recover the detected errors at-least for 91% of the cases.


Sign in / Sign up

Export Citation Format

Share Document