Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM

Electronics ◽

10.3390/electronics9122074 ◽

2020 ◽

Vol 9 (12) ◽

pp. 2074

Author(s):

J.-Carlos Baraza-Calvo ◽

Joaquín Gracia-Morán ◽

Luis-J. Saiz-Adalid ◽

Daniel Gil-Tomás ◽

Pedro-J. Gil-Vicente

Keyword(s):

Fault Tolerance ◽

Error Correction ◽

Error Detection ◽

Fault Injection ◽

Error Correction Codes ◽

Transient Faults ◽

Tolerance Mechanism ◽

Intermittent Faults ◽

RISC Processor ◽

Simulation Based

Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC) that can modify its behavior when the error conditions change, without increasing the redundancy. As a case study, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) code and a code developed by our research group, called EPB3932, capable of correcting single errors as well as double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the proposed adaptive fault tolerance mechanism on two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time, and power consumption.
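The mode swap described in this abstract can be pictured with a small sketch. The abstract does not say how an intermittent fault is detected, so the rule used here (the same bit position being corrected a fixed number of times) is purely an assumption, as are the class name `AdaptiveEccController`, the `threshold` parameter, and the method names. Only the two mode labels, SEC-DED and EPB3932, come from the abstract.

```python
from collections import Counter


class AdaptiveEccController:
    """Hypothetical mode controller, not the authors' design.

    Starts with the generic code and switches to the bit-specific code
    once one bit position has been corrected `threshold` times.
    """

    def __init__(self, threshold=3):
        self.threshold = threshold        # assumed repeat count that signals an intermittent fault
        self.error_counts = Counter()     # corrections seen per bit position
        self.mode = "SEC-DED"             # initial generic ECC
        self.error_prone_bit = None       # bit tagged as error-prone after the swap

    def report_corrected_error(self, bit_position):
        """Record one corrected single-bit error and return the active mode."""
        if self.mode != "SEC-DED":
            return self.mode              # already adapted; nothing more to do
        self.error_counts[bit_position] += 1
        if self.error_counts[bit_position] >= self.threshold:
            self.error_prone_bit = bit_position
            self.mode = "EPB3932"         # swap to the specific ECC
        return self.mode


ctrl = AdaptiveEccController(threshold=3)
for pos in (5, 17, 5):
    ctrl.report_corrected_error(pos)
mode_before = ctrl.mode                   # bit 5 seen twice: still generic
mode_after = ctrl.report_corrected_error(5)
```

Because both codes are (39, 32), a swap of this kind changes only the encoder and decoder logic, not the width of the stored word, which matches the abstract's claim that redundancy does not increase.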

Efficiency Estimation of Single Error Correction, Double Error Detection and Double-Adjacent-Error Correction Codes

Advances in Intelligent Systems and Computing - Applied Informatics and Cybernetics in Intelligent Systems ◽

10.1007/978-3-030-51974-2_48 ◽

2020 ◽

pp. 518-525

Author(s):

N. D. Kustov ◽

E. S. Lepeshkina ◽

V. Kh. Khanov

Keyword(s):

Error Correction ◽

Error Detection ◽

Error Correction Codes ◽

Single Error ◽

Efficiency Estimation

Power Series Representation of Logical Functions and its Applications to Error Detection and Error Correction Codes. (Dept. E)

MEJ. Mansoura Engineering Journal ◽

10.21608/bfemu.2021.165478 ◽

2021 ◽

Vol 18 (3) ◽

pp. 1-12

Author(s):

Yehia Enab ◽

Fayez Zaki

Keyword(s):

Power Series ◽

Error Correction ◽

Error Detection ◽

Series Representation ◽

Error Correction Codes ◽

Logical Functions ◽

Power Series Representation

Ultrafast Error Correction Codes for Double Error Detection/Correction

2016 12th European Dependable Computing Conference (EDCC) ◽

10.1109/edcc.2016.28 ◽

2016 ◽

Cited By ~ 6

Author(s):

Luis-J. Saiz-Adalid ◽

Pedro Gil ◽

Juan-Carlos Ruiz ◽

Joaquin Gracia-Moran ◽

Daniel Gil-Tomas ◽

...

Keyword(s):

Error Correction ◽

Error Detection ◽

Error Correction Codes

Secure memories resistant to both random errors and fault injection attacks using nonlinear error correction codes

Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy - HASP '13 ◽

10.1145/2487726.2487731 ◽

2013 ◽

Cited By ~ 8

Author(s):

Shizun Ge ◽

Zhen Wang ◽

Pei Luo ◽

Mark Karpovsky

Keyword(s):

Error Correction ◽

Fault Injection ◽

Error Correction Codes ◽

Random Errors ◽

Nonlinear Error ◽

Injection Attacks ◽

Fault Injection Attacks

Error Correction Codes Derived from Orthogonal Latin Square Codes

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1205.0882s819 ◽

2019 ◽

Vol 8 (2S8) ◽

pp. 1948-1952

Keyword(s):

Error Correction ◽

Error Detection ◽

Latin Square ◽

Error Correction Codes ◽

Modular Construction ◽

Golay Code ◽

Technology Scaling ◽

Burst Error ◽

Majority Logic Decoding ◽

IC Technology

With developments in IC technology and the rapid increase in transistor densities and scaling, the use of ECCs has gained prominence. Multiple-bit errors in memories caused by technology scaling demand advanced error correction codes. SEC-DED, DEC, burst error detection, Golay, and Reed-Solomon codes, among others, have high decoding complexity and latency. These drawbacks can be reduced with orthogonal Latin square (OLS) codes. OLS codes with majority logic decoding (MLD) offer modular construction and simple decoding mechanisms, which enable low decoding delay. Multiple-bit upsets (MBUs) can be addressed using OLS-MLD codes. This paper presents a detailed study of developments in multibit ECCs using the OLS-MLD mechanism.
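To make the majority logic decoding idea concrete, here is a minimal sketch of the simplest OLS-style code, the single-error-correcting case (t = 1), where the m × m data bits are protected by m row parities and m column parities. Each data bit takes part in 2t = 2 checks and is flipped only when both disagree with the stored parities. The function names and the choice of m = 3 are assumptions for illustration; stronger OLS codes add further check groups derived from Latin squares, which this sketch does not cover.

```python
def ols_encode(data, m):
    """Append m row parities and m column parities to m*m data bits (row-major)."""
    rows = [0] * m
    cols = [0] * m
    for i, bit in enumerate(data):
        rows[i // m] ^= bit
        cols[i % m] ^= bit
    return list(data) + rows + cols


def ols_decode(word, m):
    """One-step majority logic decoding for the t = 1 case."""
    k = m * m
    data = list(word[:k])
    recomputed = ols_encode(data, m)
    # A check "fails" when the recomputed parity differs from the stored one.
    row_fail = [a ^ b for a, b in zip(recomputed[k:k + m], word[k:k + m])]
    col_fail = [a ^ b for a, b in zip(recomputed[k + m:], word[k + m:])]
    for i in range(k):
        votes = row_fail[i // m] + col_fail[i % m]
        if votes == 2:          # majority of the 2t = 2 checks covering this bit
            data[i] ^= 1
    return data


m = 3
original = [1, 0, 1, 1, 1, 0, 0, 0, 1]
codeword = ols_encode(original, m)

data_error = list(codeword)
data_error[4] ^= 1              # single error in a data bit

check_error = list(codeword)
check_error[9] ^= 1             # single error in a stored row parity
```

An error in a stored parity bit makes only one check fail, so no data bit collects two votes and the data is returned unchanged. The per-bit vote needs no syndrome lookup, which is the source of the low decoding delay mentioned above.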

A Cryptography Core Tolerant to DFA Fault Attacks

Journal of Integrated Circuits and Systems ◽

10.29292/jics.v2i1.231 ◽

2007 ◽

Vol 2 (1) ◽

pp. 14-21

Author(s):

Carlos R. Moratelli ◽

Érika Cota ◽

Marcelo S. Lubaszewski

Keyword(s):

Fault Detection ◽

Error Correction ◽

Fault Injection ◽

Smart Cards ◽

Experimental Results ◽

Transient Faults ◽

Fault Attacks ◽

Cryptographic Algorithms

This work describes a hardware approach for concurrent fault detection and error correction in a cryptographic core. It has been shown in the literature that transient faults injected in a cryptographic core can lead to the revelation of the encryption key using quite inexpensive equipment. This kind of attack is a real threat to tamper-resistant devices like Smart Cards. To tackle such attacks, the cryptographic core must be immune to transient faults. In this work the DES algorithm is taken as a vulnerable cryptosystem case study. We show how an attack against DES is performed through a fault injection campaign. Then, a countermeasure based on partial hardware replication is proposed and applied to DES. Experimental results show the efficiency of the proposed scheme to protect DES against DFA fault attacks. Furthermore, the proposed solution is independent of implementation, and can be applied to other cryptographic algorithms, such as AES.

Fault Tolerance in Carbon Nanotube Transistors Based Multi Valued Logic

10.5772/intechopen.95361 ◽

2021 ◽

Author(s):

Gopalakrishnan Sundararajan

Keyword(s):

Fault Tolerance ◽

Error Correction ◽

Field Effect ◽

Fault Tolerant ◽

Field Effect Transistors ◽

Transient Faults ◽

Carbon Nanotube Transistors ◽

Nanotube Transistors ◽

Modular Redundancy ◽

Carbon Nano Tube

This chapter presents a solution for fault tolerance in Multi-Valued Logic (MVL) circuits composed of Carbon Nano-Tube Field Effect Transistors (CNTFET). It reviews the basic primitives of MVL and describes ternary implementations of CNTFET circuits. Finally, it describes a method for error correction called Restorative Feedback (RFB). The RFB method is a variant of Triple-Modular Redundancy (TMR) that utilizes the fault masking capabilities of the Muller C element to provide added protection against noisy transient faults. The fault-tolerant properties of the Muller C element are discussed and the error correction capability of the RFB method is demonstrated in detail.
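The two building blocks named in this abstract, TMR voting and the Muller C element, can be sketched behaviorally. The code below is a binary simplification (the chapter works with ternary logic) and does not reproduce the RFB circuit itself, whose wiring the abstract does not give. The names `majority` and `MullerC` are illustrative assumptions.

```python
def majority(a, b, c):
    """Classic TMR voter: output the value held by at least two of three inputs."""
    return (a & b) | (b & c) | (a & c)


class MullerC:
    """Behavioral model of a two-input Muller C element.

    The output follows the inputs only when they agree; when they
    disagree the element holds its previous output, which masks a
    transient glitch on a single input.
    """

    def __init__(self, initial=0):
        self.state = initial

    def update(self, a, b):
        if a == b:
            self.state = a
        return self.state


voted = majority(1, 0, 1)          # one faulty replica is outvoted

c = MullerC(initial=0)
out_agree_high = c.update(1, 1)    # inputs agree: output becomes 1
out_glitch = c.update(1, 0)        # one input glitches: output is held
out_agree_low = c.update(0, 0)     # inputs agree again: output becomes 0
```

The hold behavior is what distinguishes the C element from a plain voter: a disagreement leaves the output untouched instead of forcing a decision.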

A Self-Checking Hardware Journal for a Fault-Tolerant Processor Architecture

International Journal of Reconfigurable Computing ◽

10.1155/2011/962062 ◽

2011 ◽

Vol 2011 ◽

pp. 1-15 ◽

Cited By ~ 3

Author(s):

Mohsin Amin ◽

Abbas Ramazani ◽

Fabrice Monteiro ◽

Camille Diou ◽

Abbas Dandache

Keyword(s):

Fault Tolerance ◽

Error Detection ◽

Error Control ◽

Fault Tolerant ◽

Error Rates ◽

Main Memory ◽

Transient Faults ◽

Processor Core ◽

Detection Techniques ◽

Performance Area

We introduce a specialized self-checking hardware journal used as the centerpiece of our design strategy to build a processor tolerant to transient faults. Fault tolerance here relies on the use of error detection techniques in the processor core together with journalization and rollback execution to recover from erroneous situations. Effective rollback recovery is possible thanks to the use of a hardware journal and the choice of a stack computing architecture for the processor core instead of the usual RISC or CISC. The main objective of the journalization and the self-checking hardware journal is to prevent data not yet validated from being sent to the main memory, and to allow fast rollback of execution in faulty situations. The main memory, assumed to be fault secure in our model, only contains valid (uncorrupted) data obtained from fault-free computations. Error control coding techniques are used both in the processor core to detect errors and in the hardware journal to protect the temporarily stored data from possible changes induced by transient faults. Implementation results on an FPGA of the Altera Stratix-II family clearly show the relevance of the approach, both in terms of performance/area tradeoff and fault tolerance effectiveness, even for high error rates.
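The journalization idea can be illustrated with a short software model: writes are held in a journal until the current execution sequence is validated, and only then reach main memory. If an error is detected, the journal is discarded. The class `WriteJournal` and its method names are hypothetical; the actual design is a hardware structure with its own error control coding, which this sketch leaves out.

```python
class WriteJournal:
    """Hypothetical software model of a write journal with rollback."""

    def __init__(self, memory):
        self.memory = memory      # main memory: holds validated data only
        self.pending = {}         # unvalidated writes, keyed by address

    def write(self, addr, value):
        """Buffer a write instead of sending it to main memory."""
        self.pending[addr] = value

    def read(self, addr):
        """Return the newest value, preferring unvalidated journal entries."""
        return self.pending.get(addr, self.memory.get(addr))

    def commit(self):
        """Sequence validated as fault-free: flush the journal to memory."""
        self.memory.update(self.pending)
        self.pending.clear()

    def rollback(self):
        """Error detected: drop unvalidated writes; memory is untouched."""
        self.pending.clear()


memory = {0x10: 1}
journal = WriteJournal(memory)

journal.write(0x10, 99)           # possibly corrupted result
seen_during_sequence = journal.read(0x10)
journal.rollback()                # an error was detected in this sequence
after_rollback = journal.read(0x10)

journal.write(0x20, 7)            # re-execution produces a clean result
journal.commit()
```

Since main memory never receives unvalidated data, recovery consists only of clearing the journal and re-executing from the last validated point.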

Compact and High-Speed Hsiao-Based SEC-DED Codec for Cache Memory

Journal of Circuits System and Computers ◽

10.1142/s0218126622500049 ◽

2021 ◽

pp. 2250004

Author(s):

Jagannath Samanta ◽

Akash Kewat

Keyword(s):

Error Correction ◽

Error Detection ◽

High Speed ◽

Memory Systems ◽

Cache Memory ◽

Error Correction Codes ◽

Parity Check Matrix ◽

Single Error ◽

Check Matrix ◽

Field Programmable

Interest has been rising steadily in multi-bit error correction codes (ECCs) for protecting memory cells from soft errors, which may also enhance the reliability of memory systems. Single error correction and double error detection (SEC-DED) codes are generally employed in many high-speed memory systems. In this paper, Hsiao-based SEC-DED codes are optimized using two proposed optimization algorithms applied to the parity check matrix and the error correction logic. The theoretical area complexity of the proposed SEC-DED codecs is up to 49.29%, 18.64% and 49.21% lower than that of the Hsiao codes [M. Y. Hsiao, A class of optimal minimum odd-weight-column SEC-DED codes, IBM J. Res. Dev. 14 (1970) 395–401], the Reviriego et al. codes [P. Reviriego, S. Pontarelli, J. A. Maestro and M. Ottavi, A method to construct low delay single error correction codes for protecting data bits only, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 32 (2013) 479–483] and the Liu et al. codes [S. Liu, P. Reviriego, L. Xiao and J. A. Maestro, A method to recover critical bits under a double error in SEC-DED protected memories, Microelectron. Reliab. 73 (2017) 92–96], respectively. The proposed codec is designed and implemented on both field programmable gate array (FPGA) and ASIC platforms. The synthesized SEC-DED codecs need 31.14% fewer LUTs than the original Hsiao code. The optimized codec is faster than the existing related codecs without affecting its power consumption. These compact and faster SEC-DED codecs are employed in cache memory to enhance reliability.
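For readers unfamiliar with Hsiao codes, the following toy sketch shows the property they rely on: every column of the parity check matrix has odd weight, so a single error yields an odd-weight syndrome and a double error yields a nonzero even-weight syndrome. The (8, 4) size, the specific column assignment, and the function names are assumptions chosen for brevity. The paper's optimization algorithms are not reproduced here.

```python
# Columns of the parity check matrix H, one 4-bit column per codeword bit.
# Data bits use the weight-3 columns, check bits the weight-1 columns.
DATA_COLS = [0b0111, 0b1011, 0b1101, 0b1110]
CHECK_COLS = [0b0001, 0b0010, 0b0100, 0b1000]
COLUMNS = DATA_COLS + CHECK_COLS


def syndrome(word):
    """XOR together the H columns of every bit that is set in the word."""
    s = 0
    for bit, col in zip(word, COLUMNS):
        if bit:
            s ^= col
    return s


def encode(data):
    """Return the 4 data bits followed by the 4 check bits."""
    check = 0
    for bit, col in zip(data, DATA_COLS):
        if bit:
            check ^= col
    return list(data) + [(check >> i) & 1 for i in range(4)]


def decode(word):
    """Return (data, status); data is None when a double error is detected."""
    s = syndrome(word)
    if s == 0:
        return list(word[:4]), "ok"
    if bin(s).count("1") % 2 == 1:        # odd weight: single error
        fixed = list(word)
        fixed[COLUMNS.index(s)] ^= 1      # syndrome equals the faulty bit's column
        return fixed[:4], "corrected"
    return None, "double_error"           # even nonzero weight: detect only


codeword = encode([1, 0, 1, 1])

single = list(codeword)
single[2] ^= 1                            # one flipped data bit

double = list(codeword)
double[0] ^= 1                            # two flipped bits
double[5] ^= 1
```

In hardware, the parity test on the syndrome is a single XOR tree, which is one reason odd-weight-column codes are popular for fast memory codecs.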

Two control-flow error recovery methods for multithreaded programs running on multi-core processors

Facta universitatis - series Electronics and Energetics ◽

10.2298/fuee1503309k ◽

2015 ◽

Vol 28 (3) ◽

pp. 309-323 ◽

Cited By ~ 1

Author(s):

Navid Khoshavi ◽

Hamid Zarandi ◽

Mohammad Maghsoudloo

Keyword(s):

Error Detection ◽

Data Flow ◽

Fault Injection ◽

Error Recovery ◽

Control Flow ◽

Transient Faults ◽

Multithreaded Programs ◽

Recovery Techniques ◽

And Performance ◽

Using Data

This paper presents two control-flow error (CFE) recovery techniques: CFE Recovery using Data-flow graph Consideration and CFE Recovery using Macro block-level Checkpointing. These techniques are proposed with regard to thread interactions in the programs. They aim to moderate the high memory and performance overheads of conventional control-flow checking techniques. The proposed recovery techniques are composed of two phases: control-flow error detection and recovery. These phases are designed by inserting additional instructions into the program at compile time, considering a dependency graph extracted from the control-flow and data-flow dependencies among basic blocks and the thread interactions in the programs. To evaluate the proposed techniques, five multithreaded benchmarks are run on a multi-core processor. Moreover, a total of 10000 transient faults have been injected into several executable points of each program. Fault injection experiments show that the proposed techniques recover the detected errors in at least 91% of the cases.
