Comparing the effects of intermittent and transient hardware faults on programs

The paper proposes a new reliable fault-tolerant scheduling algorithm for real-time embedded systems. The algorithm is based on static scheduling, which allows it to account for task dependencies, task execution costs, and data dependencies in its scheduling decisions. It targets multi-bus heterogeneous architectures in which multiple processors are linked by several shared buses. The algorithm considers only a single bus fault, caused by hardware faults and compensated for by software redundancy solutions. It uses both active and passive backup copies to minimize the scheduling length of data on buses. In the experiments, the proposed methods are evaluated in terms of data scheduling length for a set of DSP benchmarks. The experimental results show the effectiveness of the technique.

LLFI: An Intermediate Code-Level Fault Injection Tool for Hardware Faults

2015 IEEE International Conference on Software Quality, Reliability and Security ◽

10.1109/qrs.2015.13 ◽

2015 ◽

Cited By ~ 19

Author(s):

Qining Lu ◽

Mostafa Farahani ◽

Jiesheng Wei ◽

Anna Thomas ◽

Karthik Pattabiraman

Keyword(s):

Fault Injection ◽

Hardware Faults

Cross-layer system reliability assessment framework for hardware faults

2016 IEEE International Test Conference (ITC) ◽

10.1109/test.2016.7805863 ◽

2016 ◽

Cited By ~ 11

Author(s):

A. Vallero ◽

A. Savino ◽

G. Politano ◽

S. Di Carlo ◽

A. Chatzidimitriou ◽

...

Keyword(s):

System Reliability ◽

Reliability Assessment ◽

Cross Layer ◽

Assessment Framework ◽

Layer System ◽

Hardware Faults

An Efficient Fault-Tolerant Multi-Bus Data Scheduling Algorithm Based on Replication and Deallocation

Cybernetics and Information Technologies ◽

10.1515/cait-2016-0021 ◽

2016 ◽

Vol 16 (2) ◽

pp. 69-84

Author(s):

Chafik Arar ◽

Mohamed Salah Khireddine

Keyword(s):

Embedded Systems ◽

Real Time ◽

Fault Tolerant ◽

Scheduling Algorithm ◽

Experimental Results ◽

Data Scheduling ◽

Heterogeneous Architectures ◽

Hardware Faults

Abstract: The paper proposes a new reliable fault-tolerant scheduling algorithm for real-time embedded systems. The proposed scheduling algorithm considers only a single bus fault in multi-bus heterogeneous architectures, caused by hardware faults and compensated for by software redundancy solutions. The algorithm uses both active and passive backup copies to minimize the scheduling length of data on buses. In the experiments, the paper evaluates the proposed methods in terms of data scheduling length for a set of DAG benchmarks. The experimental results show the effectiveness of the technique.

Superposed Redundancy Approach for Building Reliable Communication in Multi-Bus Heterogeneous Systems

International Journal of Embedded and Real-Time Communication Systems ◽

10.4018/ijertcs.2019010101 ◽

2019 ◽

Vol 10 (1) ◽

pp. 1-21

Author(s):

Chafik Arar

Keyword(s):

Fault Tolerant ◽

Scheduling Algorithm ◽

Heterogeneous Systems ◽

List Scheduling ◽

Reliable Communication ◽

Data Scheduling ◽

New Variant ◽

Hardware Faults

In this article, the author uses a new variant of passive redundancy, which allows for a fictitious dual assignment by simultaneously scheduling two backup copies that overlap on the same communication bus at a given time. The proposed reliable fault-tolerant greedy list scheduling algorithm is based on a superposed backup copy. The algorithm considers up to n communication bus faults, caused by hardware faults and compensated for by software redundancy solutions. It allows reliable communication and efficient use of buses. In the experiments, the proposed methods are evaluated in terms of data scheduling length for a set of DSP benchmarks from DSPstone.

Reconfigurable Embedded Control Systems

Reconfigurable Embedded Control Systems ◽

10.4018/978-1-60960-086-0.ch010 ◽

2011 ◽

pp. 235-273

Author(s):

Mohamed Khalgui ◽

Olfa Mosbahi

Keyword(s):

Control Systems ◽

Research Work ◽

User Requirements ◽

Multi Agent Systems ◽

Embedded Control ◽

Embedded Control Systems ◽

Temporal Properties ◽

Logic Computation ◽

Multi Agent ◽

Hardware Faults

The chapter deals with distributed multi-agent reconfigurable embedded control systems that follow the component-based international industrial standard IEC 61499, in which a Function Block (FB) is an event-triggered software component that owns data, and a control application is a distributed network of Function Blocks that must satisfy the functional and temporal properties described in user requirements. The authors define a new reconfiguration semantics in which a crucial criterion is the automatic improvement of the system’s performance at run-time, in addition to its protection when hardware faults occur. To handle all possible cases in industry, the authors classify the reconfiguration scenarios into three forms and then define an architecture of reconfigurable multi-agent systems in which a Reconfiguration Agent is assigned to each device of the execution environment to apply local reconfigurations, and a Coordination Agent handles coordination between devices in order to guarantee safe and adequate distributed reconfigurations. A communication protocol is proposed to handle coordination between agents by using well-defined Coordination Matrices. The authors specify the Reconfiguration Agents as nested state machines, and the Coordination Agent according to the formalism of Net Condition/Event Systems (NCES), an extension of Petri nets. To verify the whole architecture, the authors apply the model checker SESA in each device to check functional and temporal properties described in the temporal logic Computation Tree Logic (CTL); they also check coordination between devices by verifying that, whenever a reconfiguration is applied in a device, the Coordination Agent and the other concerned devices react as described in user requirements. The chapter’s contributions are applied to two benchmark production systems available in the authors’ research laboratory.
