Design and Test Technology for Dependable Systems-on-Chip - Advances in Computer and Electrical Engineering
Latest Publications


TOTAL DOCUMENTS: 22 (five years: 0)

H-INDEX: 3 (five years: 0)

Published By IGI Global

ISBN: 9781609602123, 9781609602147

Author(s): Matteo Sonza Reorda, Luca Sterpone, Massimo Violante

Transient faults have become an increasing concern in recent years, as the smaller geometries of newer, highly miniaturized silicon manufacturing technologies have brought to the mass market failure mechanisms traditionally confined to niche domains such as electronic equipment for avionics, space, or nuclear applications. This chapter presents the origin of transient faults, discusses their propagation mechanisms, outlines the models devised to represent them, and finally reviews state-of-the-art design techniques that can be used to detect and correct transient faults. The concepts of hardware, data, and time redundancy are presented, and their implementations to cope with transient faults affecting storage elements, combinational logic, and IP cores (e.g., processor cores) typically found in a System-on-Chip are discussed.
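The hardware-redundancy concept mentioned above can be illustrated with a minimal sketch of triple modular redundancy (TMR), one of the classic masking techniques in this area. The function names and the bit-wise formulation are illustrative assumptions, not taken from the chapter:

```python
def tmr_vote(a, b, c):
    """Bit-wise majority voter: each output bit takes the value that at
    least two of the three replicas agree on, masking a single transient
    fault in any one replica."""
    return (a & b) | (a & c) | (b & c)

def run_tmr(module, inputs):
    """Run three identical replicas and vote on their outputs. In real
    hardware the replicas operate in parallel; sequential calls stand in
    for that here."""
    return tmr_vote(module(inputs), module(inputs), module(inputs))
```

A single upset flipping bits in one replica's output is masked: `tmr_vote(0b1011, 0b1111, 0b1011)` still yields `0b1011`. The cost is the tripled hardware plus the voter, which is why data and time redundancy are attractive alternatives in some designs.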


Author(s): Dimitar Nikolov, Mikael Väyrynen, Urban Ingelsson, Virendra Singh, Erik Larsson

While the rapid development of semiconductor technologies makes it possible to manufacture integrated circuits (ICs) with multiple processors, so-called Multi-Processor Systems-on-Chip (MPSoCs), ICs manufactured in recent semiconductor technologies are becoming increasingly susceptible to transient faults, which makes fault tolerance necessary. Work on fault tolerance has mainly focused on safety-critical applications; however, the development of semiconductor technologies makes fault tolerance needed also for general-purpose systems. Unlike safety-critical systems, where meeting hard deadlines is the main requirement, for general-purpose systems it is more important to minimize the average execution time (AET). The contribution of this chapter is two-fold. First, the authors present a mathematical framework for the analysis of AET. The analysis is performed for voting, rollback recovery with checkpointing (RRC), and the combination of RRC and voting (CRV): for a given job and soft (transient) error probability, the authors define mathematical formulas for each fault-tolerance technique with the objective of minimizing AET while taking bus communication overhead into account, and, for a given number of processors and jobs, they define integer linear programming models that minimize AET including communication overhead. Second, as the error probability is not known at design time and can change during operation, the authors present two techniques, periodic probability estimation (PPE) and aperiodic probability estimation (APE), to estimate the error probability and adjust the fault-tolerance scheme while the IC is in operation.
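The flavour of the AET analysis can be conveyed with a deliberately simplified RRC model. This is an illustrative sketch only, not the authors' formulas: it assumes independent per-time-unit error probability, ignores bus communication overhead and voting, and treats each checkpointed segment's number of executions as geometrically distributed:

```python
def aet_rrc(total_time, n_checkpoints, checkpoint_cost, p_err_per_unit):
    """Expected execution time under rollback recovery with checkpointing:
    a failed segment is re-executed from its last checkpoint, so the number
    of runs per segment is geometric with the segment's success probability."""
    seg = total_time / n_checkpoints
    p_seg_ok = (1 - p_err_per_unit) ** seg   # segment completes error-free
    return n_checkpoints * (seg + checkpoint_cost) / p_seg_ok

def best_checkpoint_count(total_time, checkpoint_cost, p, max_n=50):
    """Sweep the checkpoint count to minimize AET: more checkpoints make
    rollbacks cheaper but add fixed checkpointing overhead."""
    return min(range(1, max_n + 1),
               key=lambda n: aet_rrc(total_time, n, checkpoint_cost, p))
```

With a nonzero error probability, some intermediate checkpoint count minimizes AET, which is exactly the trade-off the chapter's framework optimizes (there with communication overhead included).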


Author(s): Zhiyuan He, Zebo Peng, Petru Eles

High temperature has become a technological barrier to the testing of high-performance systems-on-chip, especially when deep submicron technologies are employed. In order to reduce test time while keeping the temperature of the cores under test within a safe range, thermal-aware test scheduling techniques are required. In this chapter, the authors formulate the test-time minimization problem as generating the shortest test schedule such that the temperature limits of individual cores and the limit on the test-bus bandwidth are satisfied. To avoid overheating during the test, the authors partition test sets into shorter test sub-sequences and add cooling periods in between, such that applying a test sub-sequence does not drive the core temperature beyond the limit. Furthermore, based on the test partitioning scheme, the authors interleave the test sub-sequences from different test sets in such a manner that a cooling period reserved for one core is utilized for the test transportation and application of another core. The authors propose an approach to minimize the test application time by exploring alternative test partitioning and interleaving schemes with variable lengths of test sub-sequences and cooling periods, as well as alternative test schedules. Experimental results show the efficiency of the proposed approach.
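The partitioning-with-cooling idea can be sketched for a single core with a toy thermal model (linear heating per test cycle, exponential cooling toward ambient). The model and every parameter value here are illustrative assumptions, far simpler than the thermal simulation such scheduling actually relies on, and the sketch ignores interleaving across cores and test-bus bandwidth:

```python
def thermal_schedule(test_len, t_limit, t_ambient=25.0, heat=2.0, cool=0.8):
    """Greedy single-core test partitioning: apply a test cycle only while
    the next cycle keeps the core at or below its temperature limit;
    otherwise insert a cooling cycle. Returns (schedule length, peak temp)."""
    temp, applied, total, peak = t_ambient, 0, 0, t_ambient
    while applied < test_len:
        if temp + heat <= t_limit:
            temp += heat                                  # one test cycle
            applied += 1
        else:
            temp = t_ambient + (temp - t_ambient) * cool  # cooling period
        total += 1
        peak = max(peak, temp)
    return total, peak
```

The gap between the returned schedule length and `test_len` is the cooling overhead; interleaving sub-sequences from another core into those cooling periods is what shortens the overall schedule in the chapter's approach.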


Author(s): Tobias Koal, Rene Kothe, Heinrich Theodor Vierhaus

Testing complex systems-on-chip (SoCs) with up to billions of transistors has been a challenge to IC test technology for more than a decade. Most of the research in IC test technology has focused on problems of production testing, while the problem of self-test in the field of application has received much less attention. With SoCs also being used in long-living systems for safety-critical applications, such enhanced self-test capabilities become essential for the dependability of the host system. For example, automotive electronic systems must be capable of performing a fast and effective start-up self-test. For future self-repairing systems, fault diagnosis will become necessary, since it is the basis for dedicated system re-configuration. One way to solve this problem is a hierarchical self-test scheme for embedded SoCs, based on hardware and software. The core of the test architecture is then a test processor device, which is optimised to organize and control test functions efficiently and at minimum cost. This device must be highly reliable by itself. The chapter introduces the basic concept of hierarchical HW/SW-based self-test, the test processor concept and architecture, and its role in a hierarchical self-test scheme for SoCs.
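The hierarchical idea of descending from system-level tests into finer-grained diagnosis, which a test processor would orchestrate, can be sketched as a recursive traversal. The tree structure, node names, and pass/fail interface below are hypothetical simplifications, not the chapter's architecture:

```python
def hierarchical_test(node, run_bist):
    """Depth-first diagnosis: test a node; only if it fails, descend into
    its children to localize the fault (the basis for re-configuration).
    A node is a (name, children) pair; run_bist(name) returns pass/fail."""
    name, children = node
    if run_bist(name):
        return []                  # subtree passes, no need to descend
    if not children:
        return [name]              # leaf: fault localized here
    faults = []
    for child in children:
        faults += hierarchical_test(child, run_bist)
    return faults or [name]        # children pass: blame the node itself
```

The payoff of the hierarchy is that passing subsystems are never tested in detail, which keeps a start-up self-test fast while still localizing faults precisely enough for repair.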


Author(s): Raimund Ubar, Jaan Raik, Artur Jutman, Maksim Jenihhin

In order to cope with the complexity of today’s digital systems in diagnostic modeling, hierarchical multi-level approaches should be used. In this chapter, the possibilities of using Decision Diagrams (DDs) for uniform diagnostic modeling of digital systems at different levels of abstraction are discussed. DDs can be used for modeling the functions and faults of systems at the logic, register-transfer, and behavioral (instruction-set architecture) levels. The authors differentiate two general types of DDs: logic-level binary DDs (BDDs) and high-level DDs (HLDDs). Special classes of BDDs are described: structurally synthesized BDDs (SSBDDs) and structurally synthesized BDDs with multiple inputs (SSMIBDDs). A method of iterative synthesis of SSBDDs and SSMIBDDs is discussed. Three methods for the synthesis of HLDDs representing digital systems at higher levels are described: iterative superposition of HLDDs for high-level structural representations of systems, symbolic execution of procedural descriptions for functional representations of systems, and creation of vector HLDDs (VHLDDs) on the basis of shared HLDDs for compactly representing a given set of high-level functions. The nodes in DDs can be modeled as generic locations of faults; for a more precise general specification of faults, different logic constraints are used. A functional fault model to map low-level faults to higher levels, in particular to map physical defects from the transistor level to the logic level, is discussed.
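The underlying principle of a decision diagram, that evaluating a function is a single root-to-terminal path traversal, can be shown with a minimal binary DD. This sketch shows only the generic BDD idea; the SSBDD/SSMIBDD classes in the chapter carry additional structural information, and the example function and node layout are illustrative:

```python
class Node:
    """A decision-diagram node: terminals carry a value; internal nodes
    branch on a variable (low = variable is 0, high = variable is 1)."""
    def __init__(self, var=None, low=None, high=None, value=None):
        self.var, self.low, self.high, self.value = var, low, high, value

ZERO, ONE = Node(value=0), Node(value=1)

def evaluate(node, assignment):
    """Trace one path from the root to a terminal under a given input
    assignment; the terminal's value is the function's value."""
    while node.value is None:
        node = node.high if assignment[node.var] else node.low
    return node.value

# BDD for f(x1, x2, x3) = x1 AND (x2 OR x3), variable order x1 < x2 < x3
x3_node = Node("x3", low=ZERO, high=ONE)
x2_node = Node("x2", low=x3_node, high=ONE)
root = Node("x1", low=ZERO, high=x2_node)
```

Each node visited on the path is also a natural fault location, which is what makes such diagrams usable as a uniform diagnostic model rather than only a function representation.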


Author(s): Artur Jutman, Igor Aleksejev, Jaan Raik

This chapter further details the topic of embedded self-test, directing the reader towards the aspects of embedded test generation and test sequence optimization. The authors briefly review the basics of widely used pseudorandom test generators and consider different techniques targeting the optimization of the fault coverage characteristics of generated sequences. The main focus is on one optimization technique that is applicable to reseeding-based test generators and that uses a test compaction methodology. The technique exploits a great similarity in the way faults are covered by pseudorandom sequences and by patterns generated for sequential designs. Hence, the test compaction methodology previously developed for the latter problem can be successfully reused in embedded testing.
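Reseeding-based generators are typically built around a linear feedback shift register (LFSR), which can be sketched in a few lines. The 4-bit width and tap positions below are illustrative choices (a maximal-length configuration), not parameters from the chapter:

```python
def lfsr_patterns(seed, taps, width, count):
    """Fibonacci LFSR: the feedback bit is the XOR of the tap bits
    (1-indexed from the LSB); the register shifts left each cycle and
    emits its full state as one pseudorandom test pattern."""
    state, mask = seed, (1 << width) - 1
    patterns = []
    for _ in range(count):
        patterns.append(state)
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        state = ((state << 1) | fb) & mask
    return patterns
```

A maximal-length 4-bit LFSR cycles through all 15 nonzero states before repeating. Reseeding loads a new seed once the current sequence's fault coverage saturates; selecting good seeds is where a compaction-style methodology comes in.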


Author(s): Zdenek Kotásek, Jaroslav Škarvada

Portable computer systems and embedded systems are examples of electronic devices which are powered from batteries and are therefore designed with the goal of low power consumption. Low power consumption is important not only during normal operation but also during test application, when switching activity is higher than in normal mode. In this chapter, a survey of basic concepts and methodologies from the area of low-power testing is provided. It is explained how power consumption is related to switching activity during test application. The concepts of static and dynamic power consumption are discussed, together with metrics which can be used to evaluate power consumption. A survey of methods whose goal is to reduce dynamic power consumption during test application is then provided, followed by a short survey of power-constrained test scheduling methods.
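The link between test patterns and dynamic power can be made concrete with the transition count, a common proxy metric, together with one simple reduction method: reordering patterns so that consecutive ones differ in few bits. This is a generic sketch of the idea, not a method from the chapter, and the greedy heuristic gives no optimality guarantee:

```python
def hamming(a, b):
    """Number of differing bits between two patterns (one scan transition
    per flipped bit is the usual dynamic-power proxy)."""
    return bin(a ^ b).count("1")

def total_transitions(vectors):
    """Total bit flips across an ordered test sequence."""
    return sum(hamming(a, b) for a, b in zip(vectors, vectors[1:]))

def reorder_greedy(vectors):
    """Greedy nearest-neighbour reordering: always apply next the
    remaining pattern closest (in Hamming distance) to the last one."""
    remaining = list(vectors)
    order = [remaining.pop(0)]
    while remaining:
        nxt = min(remaining, key=lambda v: hamming(order[-1], v))
        remaining.remove(nxt)
        order.append(nxt)
    return order
```

On the sequence `[0b0000, 0b1111, 0b0001, 0b1110]` the original order causes 11 transitions, while the greedy order `[0b0000, 0b0001, 0b1111, 0b1110]` causes only 5, with the same fault coverage since the pattern set is unchanged.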


Author(s): Raimund Ubar, Sergei Devadze

In the first part of the chapter, an introduction to the problem of logic-level fault simulation is given, together with an overview of existing fault simulation techniques. The remainder of the chapter describes a new approach to fault simulation based on exact critical path tracing to conduct fault analysis in logic circuits. A circuit-topology-driven computational model is presented which makes it possible not only to cope with complex structures of nested reconvergent fan-outs, but also to carry out fault reasoning for many test patterns concurrently. To speed up backtracing, the circuit is simulated at a higher level than the traditional gate level: fan-out-free regions of maximum size are considered as the components of the circuit network, and they are represented by structurally synthesized BDDs. The latter reduce the number of internal variables in the computational model and therefore allow the whole circuit to be processed faster than at the flat gate level. The method is explained first for the stuck-at fault model and then generalized to an extended class of functional fault models covering conditional stuck-at and transition faults. The method can be used for simulating permanent faults in combinational circuits, and transient or intermittent faults in both combinational and sequential circuits, with the goal of selecting malicious faults to inject into fault-tolerant systems to evaluate their dependability. Experimental results are included to give an idea of how efficiently the method works with different fault classes.
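For readers new to the problem, the baseline that critical path tracing accelerates is plain serial fault simulation: simulate the fault-free circuit, then re-simulate with each stuck-at fault injected and compare the outputs. The netlist representation and the two-gate example below are illustrative; the chapter's model works on fan-out-free regions and BDDs, not on flat gate lists like this:

```python
def simulate(gates, inputs, fault=None):
    """Evaluate a topologically ordered gate netlist. `fault=(net, v)`
    injects a stuck-at-v fault on signal `net` (primary input or gate
    output) by overriding its computed value."""
    values = dict(inputs)
    if fault and fault[0] in values:
        values[fault[0]] = fault[1]
    for name, op, ins in gates:
        v = op(*(values[i] for i in ins))
        if fault and fault[0] == name:
            v = fault[1]
        values[name] = v
    return values

def detected(gates, inputs, outputs, fault):
    """A pattern detects a fault iff some primary output differs from
    the fault-free response."""
    good = simulate(gates, inputs)
    bad = simulate(gates, inputs, fault)
    return any(good[o] != bad[o] for o in outputs)

# tiny example circuit: c = AND(a, b); d = XOR(c, a)
GATES = [("c", lambda a, b: a & b, ("a", "b")),
         ("d", lambda c, a: c ^ a, ("c", "a"))]
```

Re-simulating the whole circuit once per fault and per pattern is exactly the cost that concurrent, critical-path-tracing approaches avoid.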


Author(s): Tobias Koal, Heinrich Theodor Vierhaus

For several years, many authors have predicted that nano-scale integrated devices and circuits will have a rising sensitivity to both transient and permanent fault effects. Essentially, there is an emerging demand for building highly dependable hardware/software systems from unreliable components. Most of the effort has so far gone into the detection and compensation of transient fault effects. More recently, the possibility of repairing permanent faults, due either to production flaws or to wear-out effects after some time of operation in the field, has also come to require investigation. While built-in self-test (BIST) and even built-in self-repair (BISR) for regular structures such as static memories (SRAMs) are well understood, concepts for the in-system repair of irregular logic and interconnects are few and mainly based on field-programmable gate arrays (FPGAs) as the basic implementation. In this chapter, the authors analyse different schemes of logic (self-)repair with respect to cost and limitations, using repair schemes that are not based on FPGAs. It can be shown that such schemes are feasible, but they need a lot of attention to hidden single points of failure.
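The book-keeping at the heart of any spare-based repair scheme, mapping faulty units to compatible spares and declaring the system unrepairable when spares run out, can be sketched as follows. The block/spare model is a hypothetical simplification (it ignores interconnect rerouting and assumes the allocator itself, a potential single point of failure, is reliable):

```python
def allocate_spares(faulty_blocks, spares_by_type):
    """Assign each faulty logic block a free spare of the same type.
    Returns the block-to-spare mapping used to re-configure the system,
    or None once the spares of some type are exhausted (unrepairable)."""
    pool = {t: list(s) for t, s in spares_by_type.items()}
    mapping = {}
    for block, btype in faulty_blocks:
        free = pool.get(btype, [])
        if not free:
            return None
        mapping[block] = free.pop(0)
    return mapping
```

A first-fit allocation like this is trivial for memories, where all spare rows are interchangeable; the difficulty the chapter addresses is that irregular logic has few interchangeable units, so type-compatible spares and their switching fabric must be designed in explicitly.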


Author(s): Viacheslav Izosimov, Paul Pop, Petru Eles, Zebo Peng

The authors also present an evaluation of the schedule synthesis heuristics, with and without preemption, using extensive experiments and a real-life example.

