Advanced Technologies for Transient Faults Detection and Compensation

Transient faults have become a growing concern in the past few years, as the smaller geometries of newer, highly miniaturized silicon manufacturing technologies have brought to the mass market failure mechanisms traditionally confined to niche markets such as electronic equipment for avionic, space, or nuclear applications. This chapter presents the origin of transient faults, discusses their propagation mechanism, outlines the models devised to represent them, and finally discusses the state-of-the-art design techniques that can be used to detect and correct them. The concepts of hardware, data, and time redundancy are presented, and their implementations to cope with transient faults affecting storage elements, combinational logic, and IP cores (e.g., processor cores) typically found in a System-on-Chip are discussed.
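
As a rough illustration of hardware redundancy applied to a storage element, the Python sketch below models triple modular redundancy (TMR) with a bitwise majority voter. TMR is chosen here only as a familiar example and is not taken from the chapter; the function names `majority_vote` and `flip_bit` and the 8-bit test word are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise 2-of-3 majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single event upset by inverting one bit of a stored word."""
    return word ^ (1 << bit)


# Hypothetical 8-bit value held in three redundant registers.
golden = 0b1011_0010

# One copy suffers a transient bit flip; the voter masks it.
voted_single = majority_vote(golden, flip_bit(golden, 5), golden)

# Two copies upset in the same bit position defeat the voter.
voted_double = majority_vote(flip_bit(golden, 5), flip_bit(golden, 5), golden)

print(f"single upset masked: {voted_single == golden}")
print(f"double upset masked: {voted_double == golden}")
```

In this sketch a single upset is masked, while two upsets in the same bit position produce a wrong voted value. That is why redundant copies are usually refreshed (scrubbed) before a second fault can accumulate.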

Precise Cache Profiling for Studying Radiation Effects

ACM Transactions on Embedded Computing Systems ◽

10.1145/3442339 ◽

2021 ◽

Vol 20 (3) ◽

pp. 1-25

Author(s):

James Marshall ◽

Robert Gifford ◽

Gedare Bloom ◽

Gabriel Parmer ◽

Rahul Simha

Keyword(s):

Radiation Effects ◽

Fault Injection ◽

Error Correcting Codes ◽

Direct Access ◽

Transient Faults ◽

Large Area ◽

Common Multiple ◽

Single Event Upsets ◽

On Chip ◽

Future Work

Increased access to space has led to an increase in the usage of commodity processors in radiation environments. These processors are vulnerable to transient faults such as single event upsets that may cause bit-flips in processor components. Caches in particular are vulnerable due to their relatively large area, yet are often omitted from fault injection testing because many processors do not provide direct access to cache contents and they are often not fully modeled by simulators. The performance benefits of caches make disabling them undesirable, and the presence of error correcting codes is insufficient to correct for increasingly common multiple bit upsets. This work explores building a program’s cache profile by collecting cache usage information at an instruction granularity via commonly available on-chip debugging interfaces. The profile provides a tighter bound than cache utilization for cache vulnerability estimates (50% for several benchmarks). This can be applied to reduce the number of fault injections required to characterize behavior by at least two-thirds for the benchmarks we examine. The profile enables future work in hardware fault injection for caches that avoids the biases of existing techniques.
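
To make concrete the point that error correcting codes can fail under multiple bit upsets, the sketch below uses a plain Hamming(7,4) single-error-correcting code. This is an assumption made for brevity: the paper does not state which code is meant, and real caches often use wider codes with an extra parity bit that can detect, but still not correct, a double flip. All function names are hypothetical.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    code = [0] * 8                      # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    return code[1:]


def syndrome(code):
    """XOR of the positions of all set bits; zero for a valid codeword."""
    s = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            s ^= pos
    return s


def decode(code):
    """Flip the bit the syndrome points at (if any) and return the data bits."""
    code = list(code)
    s = syndrome(code)
    if s:
        code[s - 1] ^= 1
    return [code[2], code[4], code[5], code[6]]


def upset(code, *positions):
    """Return a copy of the codeword with the given 1-based positions flipped."""
    code = list(code)
    for pos in positions:
        code[pos - 1] ^= 1
    return code


data = [1, 0, 1, 1]
word = encode(data)
print("single upset recovered:", decode(upset(word, 6)) == data)
print("double upset recovered:", decode(upset(word, 2, 6)) == data)
```

With two flipped bits the syndrome points at a third, unaffected position, so the decoder silently miscorrects the word instead of restoring it.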

Modelling the OFDM-based PHY Layer in SoC for Visible Light Communication

International Journal of Recent Contributions from Engineering Science & IT (iJES) ◽

10.3991/ijes.v7i3.10695 ◽

2019 ◽

Vol 7 (3) ◽

pp. 79

Author(s):

Erwin Setiawan ◽

Trio Adiono ◽

Syifaul Fuada

Keyword(s):

Visible Light ◽

Data Exchange ◽

Visible Light Communication ◽

System On Chip ◽

Matlab Simulation ◽

Industry Standard ◽

Receiver Block ◽

Phy Layer ◽

Ip Cores ◽

On Chip

In this paper, we report a System-on-Chip (SoC) architecture for OFDM-based Visible Light Communication (VLC). The OFDM block was implemented as the VLC PHY layer and comprises a transmitter and a receiver. The transmitter block contains a Reed-Solomon encoder, a modulator, an IFFT, and a preamble generator, while the receiver block contains a Reed-Solomon decoder, a demodulator, an FFT, and a synchronizer. In the SoC, these blocks are designed as IP cores. The industry-standard AXI4-Stream protocol was used for data exchange between IP cores. The OFDM model in the SoC was verified by comparison with a MATLAB simulation.

Online fault diagnosis of wireless sensor networks

Open Computer Science ◽

10.2478/s13537-014-0203-8 ◽

2014 ◽

Vol 4 (1) ◽

Cited By ~ 3

Author(s):

Arunanshu Mahapatro ◽

Pabitra Khilar

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Fault Diagnosis ◽

Wireless Sensor ◽

Transient Faults ◽

Intermittent Faults ◽

Detection Latency ◽

Time Redundancy ◽

Simulation Results ◽

Time And Energy

This paper proposes an adaptive online distributed solution for fault diagnosis in wireless sensor networks (WSNs). Fault diagnosis is achieved by comparing the heartbeat messages generated by neighboring nodes and disseminating the decision made at each node. Time redundancy is used to detect intermittent faults, since an intermittent fault does not occur consistently. The degradation in diagnosis performance due to intermittent faults in sensing and transient faults in communication is analyzed. A near-optimal trade-off between detection latency and the number of tests required to detect intermittent faults is obtained. Simulation results are provided and show that this work performs better in terms of both time and energy complexity.

Task mapping and scheduling for network-on-chip based multi-core platform with transient faults

Journal of Systems Architecture ◽

10.1016/j.sysarc.2018.01.002 ◽

2018 ◽

Vol 83 ◽

pp. 34-56 ◽

Cited By ~ 8

Author(s):

Navonil Chatterjee ◽

Suraj Paul ◽

Santanu Chattopadhyay

Keyword(s):

Network On Chip ◽

Task Mapping ◽

Transient Faults ◽

On Chip

Run Control Software For The Upgrade Of The Atlas Muon To Central Trigger Processor Interface (MUCTPI)

EPJ Web of Conferences ◽

10.1051/epjconf/201921401034 ◽

2019 ◽

Vol 214 ◽

pp. 01034

Author(s):

Ralf Spiwoks ◽

Aaron Armbruster ◽

German Carrillo-Montoya ◽

Magda Chelstowska ◽

Patrick Czodrowski ◽

...

Keyword(s):

Hadron Collider ◽

Programmable Logic ◽

Trigger System ◽

Control Software ◽

Atlas Experiment ◽

Arm Processor ◽

Level 1 ◽

On Chip ◽

Control Application ◽

Processor Cores

The Muon to Central Trigger Processor Interface (MUCTPI) of the ATLAS experiment at the Large Hadron Collider (LHC) at CERN is being upgraded for the next run of the LHC in order to use optical inputs and to provide full-precision information for muon candidates to the topological trigger processor (L1TOPO) of the Level-1 trigger system. The new MUCTPI is implemented as a single ATCA blade with high-end processing FPGAs, which eliminate double counting of muon candidates in overlapping regions and send muon candidates to L1TOPO, muon multiplicities to the Central Trigger Processor (CTP), and readout data to the data acquisition system of the experiment. A Xilinx Zynq System-on-Chip (SoC) with a programmable logic part and a processor part is used for communication with the processing FPGAs and the run control system. The processor part, based on ARM processor cores, runs embedded Linux prepared using the framework of the Linux Foundation's Yocto project. The ATLAS run control software was ported to the processor part, and a run control application was developed which receives, at configuration, all data necessary for the overlap handling and candidate counting of the processing FPGAs. During running, the application provides ample monitoring of the physics data and of the operation of the hardware.

A New Fault Tolerant Routing Algorithm for Networks on Chip

International Journal of Embedded and Real-Time Communication Systems ◽

10.4018/ijertcs.2019070105 ◽

2019 ◽

Vol 10 (3) ◽

pp. 68-85

Author(s):

Chakib Nehnouh ◽

Mohamed Senouci

Keyword(s):

Network Architecture ◽

Fault Tolerant ◽

Routing Algorithm ◽

Transient Faults ◽

Networks On Chip ◽

Congestion Detection ◽

Novel Approach ◽

On Chip ◽

Detection Mechanisms ◽

Correct Data

To provide correct data transmission and to handle the communication requirements, the routing algorithm should find a new path to steer packets from the source to the destination in a faulty network. Many solutions have been proposed to overcome faults in networks-on-chip (NoCs). This article introduces a new fault-tolerant routing algorithm that tolerates permanent and transient faults in NoCs. This solution, called DINRA, can simultaneously satisfy congestion avoidance and fault tolerance. In this work, a novel approach inspired by Catnap is proposed for NoCs, using local and global congestion detection mechanisms with a hierarchical sub-network architecture. The evaluation (on reliability, latency, and throughput) shows the effectiveness of this approach in improving NoC performance compared to the state of the art. In addition, with the test module and fault register integrated in the basic architecture, the routers are able to detect faults dynamically and re-route packets to fault-free and congestion-free zones.

A Framework to Compare Estimated and Measured Power Consumption on FPGAs

Journal of Low Power Electronics ◽

10.1166/jolpe.2019.1622 ◽

2019 ◽

Vol 15 (4) ◽

pp. 329-337

Author(s):

Juan P. Oliver ◽

Federico Favaro ◽

Eduardo Boemo

Keyword(s):

Power Consumption ◽

Meta Analysis ◽

Main Idea ◽

Power Estimation ◽

Pattern Generator ◽

Analysis Techniques ◽

Input Signals ◽

Ip Cores ◽

On Chip ◽

Relative Errors

In this paper, an extensive review of the available publications comparing estimates with measurements of power consumption in FPGA technology is carried out. This study reveals that the variety of experimental setups makes it difficult to build solid studies from the results of different researchers using meta-analysis techniques. To mitigate this problem, we propose a procedure to standardize the setup of FPGA power estimation experiments. The goal is to bring power estimates as close as possible to their corresponding actual on-chip measurements. The main idea is to use a fixed arrangement composed of a parameterized pattern generator block at the input, together with a set of interchangeable IP cores used as reference circuits. All the blocks are mapped together inside the FPGA sample, with the clock and reset lines as the sole input signals. Thus, both power estimation and actual measurement are performed on the whole system under identical conditions. To illustrate the method, the paper includes some examples of the proposed methodology for different cores. A set of 25 circuits has been tested in two FPGA families, obtaining relative errors in power estimation between –61.5% and 9.2%.

A Survey of Network-on-Chip Security Attacks and Countermeasures

ACM Computing Surveys ◽

10.1145/3450964 ◽

2021 ◽

Vol 54 (5) ◽

pp. 1-36

Author(s):

Subodha Charles ◽

Prabhat Mishra

Keyword(s):

Focal Point ◽

Network On Chip ◽

Security Attacks ◽

Security Vulnerabilities ◽

Multicore System ◽

Comprehensive Survey ◽

On Chip ◽

Manufacturing Technologies ◽

Significant Attention ◽

Prime Location

With the advances of chip manufacturing technologies, computer architects have been able to integrate an increasing number of processors and other heterogeneous components on the same chip. Network-on-Chip (NoC) is widely employed by multicore System-on-Chip (SoC) architectures to cater to their communication requirements. NoC has received significant attention from both attackers and defenders. The increased usage of NoC and its distributed nature across the chip has made it a focal point of potential security attacks. Due to its prime location in the SoC coupled with connectivity with various components, NoC can be effectively utilized to implement security countermeasures to protect the SoC from potential attacks. There is a wide variety of existing literature on NoC security attacks and countermeasures. In this article, we provide a comprehensive survey of security vulnerabilities in NoC-based SoC architectures and discuss relevant countermeasures.

Modification of fault injection method via on-chip debugging for processor cores of systems-on-chip

2015 International Siberian Conference on Control and Communications (SIBCON) ◽

10.1109/sibcon.2015.7147267 ◽

2015 ◽

Cited By ~ 4

Author(s):

S.A. Chekmarev ◽

V.Kh. Khanov ◽

O.A. Antamoshkin

Keyword(s):

Fault Injection ◽

Injection Method ◽

Systems On Chip ◽

On Chip ◽

Processor Cores
