Dependability Aspects in Configurable Embedded Operating Systems

Author(s):  
Horst Schirmeier ◽  
Christoph Borchert ◽  
Martin Hoffmann ◽  
Christian Dietrich ◽  
Arthur Martens ◽  
...  

Abstract: As all conceptual layers in the software stack depend on the operating system (OS) to reliably provide resource-management services and isolation, it can be considered the “reliable computing base” that must be hardened for correct operation under fault models such as transient hardware faults in the memory hierarchy. In this chapter, we approach the problem of system-software hardening in three complementary scenarios. (1) We address the following research question: Where do the general reliability limits of static system-software stacks lie, if designed from scratch with reliability as a first-class design goal? In order to reduce the proverbial “attack surface” as far as possible, we harness static application knowledge from an AUTOSAR-compliant task set, and protect the whole OS kernel with AN-encoding. This static approach yields an extremely reliable software system, but is constrained to specific application domains. (2) We investigate how reliable a dynamic COTS embedded OS can become if hardened with programming-language and compiler-based fault-tolerance techniques. We show that aspect-oriented programming is an appropriate means to encapsulate generic software-implemented hardware fault tolerance mechanisms that can be application-specifically applied to a selection of OS components. (3) We examine how system-software stacks can survive even more adverse fault models like whole-system outages, using emerging persistent memory (PM) technology as a vehicle for state conservation. Our findings include that software transactional memory facilitates maintaining consistent state within PM and allows fast recovery.
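To make the AN-encoding idea concrete: a value x is stored as the multiple A·x, so a bit flip in the stored word almost certainly breaks divisibility by A and is caught on access. Below is a minimal C sketch of the general principle, not the chapter's actual kernel encoding; the constant and helper names are chosen here for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* AN-encoding sketch: a value x is stored as A * x. A bit flip in the
 * encoded word is detected when the word is no longer divisible by A.
 * Any suitable odd constant works for this illustration. */
#define A 58659u

typedef uint32_t an_word;             /* encoded representation */

static an_word  an_encode(uint16_t x) { return (an_word)x * A; }
static int      an_check(an_word w)   { return w % A == 0; }  /* 1 = intact */
static uint16_t an_decode(an_word w)  { return (uint16_t)(w / A); }

int main(void) {
    an_word w = an_encode(1234);

    w ^= 1u << 7;                     /* inject a transient single-bit fault */

    if (!an_check(w))
        puts("fault detected: encoded word is not a multiple of A");
    else
        printf("value: %u\n", an_decode(w));
    return 0;
}
```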

Author(s):  
Vincenzo De Florio

After having described the main characteristics of dependability and fault tolerance, this chapter analyzes in more detail what it means for a program to be fault-tolerant and which properties are expected from a fault-tolerant program. Its main objective is to introduce the two sets of design assumptions that shape the way our fault-tolerant software is structured: the system model and the fault model. Often misunderstood or underestimated, these models describe
• what is expected from the execution environment in order to let our software system function correctly, and
• which faults our system is going to consider.
Note that a fault-tolerant program shall (try to) tolerate only those faults stated in the fault model, and will be as defenseless against all other faults as any non-fault-tolerant program. Together with the system specification, the fault and system models represent the foundation on top of which our computer services are built. It is not surprising that weak foundations often result in failing constructions. What is really surprising is that in so many cases little or no attention has been given to these important factors in fault-tolerant software engineering. To illustrate this, three well-known accidents are described: the Ariane 5 flight 501 and Mariner-1 disasters and the Therac-25 accidents. In each case it is stressed what went wrong, what the biggest mistakes were, and how a careful understanding of fault models and system models would have helped highlight a path away from catastrophic failures that cost considerable amounts of money and even the lives of innocent people. The other important objective of this chapter is to introduce the core subject of this book: software fault tolerance situated at the level of the application layer. First, it is explained why targeting (also) the application layer is not optional but a mandatory design choice for effective fault-tolerant software engineering. Second, given the peculiarities of the application layer, three properties for measuring the quality of methods to achieve fault-tolerant application software are introduced:
1. Separation of design concerns, that is, how good the method is at keeping the functional aspects and the fault-tolerance aspects separated from each other.
2. Syntactical adequacy, namely how versatile the employed method is in covering a wide spectrum of fault-tolerance strategies.
3. Adaptability, that is, how good the employed fault-tolerance method is at dealing with the inevitable changes characterizing the system and its run-time environment, including the dynamics of faults that manifest themselves at service time.
Finally, this chapter also defines a few fundamental fault-tolerance services, namely watchdog timers, exception handling, transactions, and checkpointing-and-rollback.
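As a concrete illustration of the last of those services, here is a minimal application-level checkpointing-and-rollback sketch in C. This is illustrative code, not the book's; the state layout, acceptance test, and injected fault are invented for the example.

```c
#include <setjmp.h>
#include <stdio.h>
#include <string.h>

/* Application-level checkpointing-and-rollback sketch: state is copied
 * aside at a checkpoint and restored when an acceptance test fails,
 * after which execution retries from the checkpoint. */
struct state { int step; double value; };

static struct state live, saved;      /* globals survive longjmp safely */
static jmp_buf checkpoint;
static int fault_injected;            /* inject the fault only once */

static void rollback(void) {
    memcpy(&live, &saved, sizeof live); /* restore consistent state */
    longjmp(checkpoint, 1);             /* resume at the checkpoint */
}

int main(void) {
    live.step = 1; live.value = 1.0;

    if (setjmp(checkpoint) == 1)
        puts("rolled back, retrying");
    memcpy(&saved, &live, sizeof live); /* take checkpoint */

    live.value = live.value / live.step; /* the "computation" */
    if (!fault_injected) {
        fault_injected = 1;
        live.value = -1.0;               /* simulated transient fault */
    }

    if (live.value < 0.0)                /* acceptance test */
        rollback();

    printf("result: %.2f\n", live.value);
    return 0;
}
```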


2021 ◽  
Author(s):  
Raha Abedi

One of the main goals of fault injection techniques is to evaluate the fault tolerance of a design. To have greater confidence in the fault tolerance of a system, an accurate fault model is essential. While more accurate than gate level, transistor level fault models cannot be synthesized into FPGA chips. Thus, transistor level faults must be mapped to the gate level to obtain both accuracy and synthesizability. Re-synthesizing a large system for fault injection is not cost effective when the number of faults and system complexity are high. Therefore, the system must be divided into partitions to reduce the re-synthesis time as faults are injected only into a portion of the system. However, the module-based partial reconfiguration complexity rises with an increase in the total number of partitions in the system. An unbalanced partitioning methodology is introduced to reduce the total number of partitions in a system while the size of the partitions where faults are to be injected remains small enough to achieve an acceptable re-synthesis time.
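For context, gate-level fault injection on FPGAs is typically realized by instrumenting the netlist with "saboteur" multiplexers that can override a signal with a faulty value; it is these instrumented partitions that must be re-synthesized. The toy C model below shows one such injection point for illustration only; the paper's flow operates on synthesized hardware, not on C code.

```c
#include <stdint.h>
#include <stdio.h>

/* Saboteur sketch for gate-level fault injection: an instrumented
 * signal passes through a mux that can override it with a faulty
 * value. On an FPGA this mux is synthesized into the netlist; here
 * it is modelled in C. */
enum fault_kind { FAULT_NONE, STUCK_AT_0, STUCK_AT_1, BIT_FLIP };

static uint8_t saboteur(uint8_t good, enum fault_kind k) {
    switch (k) {
    case STUCK_AT_0: return 0;
    case STUCK_AT_1: return 1;
    case BIT_FLIP:   return good ^ 1u;
    default:         return good;      /* fault-free path */
    }
}

/* a tiny "circuit": full adder with the carry-out instrumented */
static uint8_t full_adder(uint8_t a, uint8_t b, uint8_t cin,
                          uint8_t *cout, enum fault_kind k) {
    uint8_t sum  = a ^ b ^ cin;
    uint8_t good = (a & b) | (cin & (a ^ b));
    *cout = saboteur(good, k);         /* injection point */
    return sum;
}

int main(void) {
    uint8_t cout;
    uint8_t sum = full_adder(1, 1, 0, &cout, STUCK_AT_0);
    printf("sum=%u cout=%u (golden cout=1)\n", sum, cout);
    return 0;
}
```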


Author(s):  
Vince Molnár ◽  
István Majzik

Failure Mode and Effects Analysis (FMEA) is a systematic technique to explore the possible failure modes of individual components or subsystems and determine their potential effects at the system level. Applications of FMEA are common in case of hardware and communication failures, but analyzing software failures (SW-FMEA) poses a number of challenges. Failures may originate in permanent software faults commonly called bugs, and their effects can be very subtle and hard to predict, due to the complex nature of programs. Therefore, a behavior-based automatic method to analyze the potential effects of different types of bugs is desirable. Such a method could be used to automatically build an FMEA report about the fault effects, or to evaluate different failure mitigation and detection techniques. This paper follows the latter direction, demonstrating the use of a model checking-based automated SW-FMEA approach to evaluate error detection and fault tolerance mechanisms on a case study inspired by safety-critical embedded operating systems.
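The essence of such an evaluation can be shown in miniature: enumerate every fault in a chosen fault model, apply the detection mechanism under study, and classify each outcome. The C sketch below does this exhaustively for single-bit faults guarded by a parity bit; it is a toy stand-in for the paper's model-checking approach, with invented values.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy fault-effect exploration: enumerate every single-bit fault in a
 * stored result and classify its effect under a given detection
 * mechanism, here a per-word parity bit. */
static unsigned parity(uint32_t w) {
    unsigned p = 0;
    while (w) { p ^= w & 1u; w >>= 1; }
    return p;
}

int main(void) {
    const uint32_t golden = 0xCAFE1234u;
    const unsigned stored_parity = parity(golden);
    int detected = 0, silent = 0;

    for (int bit = 0; bit < 32; bit++) {
        uint32_t faulty = golden ^ (1u << bit);   /* single-bit fault */
        if (parity(faulty) != stored_parity) detected++;
        else                                 silent++;  /* undetected */
    }
    printf("single-bit faults: detected=%d silent=%d\n", detected, silent);
    /* parity catches all 32 single-bit flips; repeating the loop for
     * double-bit faults would show where this mechanism breaks down */
    return 0;
}
```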


2012 ◽  
Vol 546-547 ◽  
pp. 1574-1579
Author(s):  
Zhi Wen Xiong ◽  
Wen Feng Wang ◽  
Hong Zeng

Fault tolerance is one of the major requirements for embedded systems. As embedded systems become more and more complex, there are more opportunities for various faults, and the developer has to handle them when designing the system. Before handling faults, the designer has to identify and understand their types and nature. Faults are the source of low dependability; they can be hardware or software faults, and hardware faults can be distinguished from systematic faults such as software or design errors. Faults can be removed, for example through extensive testing or formal verification, or tolerated by fault-tolerance techniques; we restrict ourselves to the problem of fault tolerance and refer to other methods for fault removal. This paper discusses a new design method for fault-tolerant embedded systems. We designed a fault-tolerant data acquisition system in a dynamically reconfigurable FPGA. The experimental results show that the system not only achieves higher self-adaptability and reliability but can also use the FPGA to implement a specific algorithm.
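A generic sketch of the voting pattern such reconfigurable-FPGA designs typically build on (an illustration of the general pattern, not the paper's actual design): three replicated acquisition channels feed a bitwise majority voter, and a disagreeing replica is flagged for repair by partial reconfiguration.

```c
#include <stdint.h>
#include <stdio.h>

/* Triple modular redundancy (TMR) for a data acquisition sample:
 * bitwise majority vote, where a bit is 1 iff at least two of the
 * three replicas agree on 1. */
static uint16_t vote(uint16_t a, uint16_t b, uint16_t c) {
    return (a & b) | (a & c) | (b & c);
}

int main(void) {
    uint16_t ch0 = 0x1A2B;             /* healthy replica */
    uint16_t ch1 = 0x1A2B;             /* healthy replica */
    uint16_t ch2 = 0x1A2B ^ 0x0040;    /* replica hit by a bit flip */

    uint16_t sample = vote(ch0, ch1, ch2);
    printf("voted sample: 0x%04X\n", sample);   /* masks the flip: 0x1A2B */

    if (ch2 != sample)
        puts("channel 2 disagrees -> schedule partial reconfiguration");
    return 0;
}
```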


Author(s):  
Peter Marwedel

Abstract: In order to cope with the complexity of applications of embedded systems, reuse of components is a key technique. As pointed out by Sangiovanni-Vincentelli (The context for platform-based design. IEEE Design and Test of Computers, 2002), software and hardware components must be reused in the platform-based design methodology (see p. 296). These components comprise knowledge from earlier design efforts and constitute intellectual property (IP). Standard software components that can be reused include system software components such as embedded operating systems (OSs) and middleware. The last term denotes software that provides an intermediate layer between the OS and application software. This chapter starts with a description of general requirements for embedded operating systems. This includes real-time capabilities as well as adaptation techniques to provide just the required functionality. Mutually exclusive access to resources can result in priority inversion, which is a serious problem for real-time systems. Priority inversion can be circumvented with resource access protocols. We will present three such protocols: the priority inheritance, priority ceiling, and stack resource protocols. A separate section covers the ERIKA real-time system kernel. Furthermore, we will explain how Linux can be adapted to systems with tight resource constraints. Finally, we will provide pointers for additional reusable software components, like hardware abstraction layers (HALs), communication software, and real-time databases. Our description of embedded operating systems and of middleware in this chapter is consistent with the overall design flow.
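As a concrete reference point for the resource access protocols mentioned above: POSIX exposes priority inheritance (and, via PTHREAD_PRIO_PROTECT, a priority-ceiling variant) as a mutex attribute. A minimal sketch follows, with thread creation and scheduling-policy set-up omitted.

```c
#include <pthread.h>
#include <stdio.h>

/* A mutex created with PTHREAD_PRIO_INHERIT temporarily raises a
 * low-priority lock holder to the priority of the highest-priority
 * thread blocked on the mutex, bounding priority inversion. */
int main(void) {
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;

    pthread_mutexattr_init(&attr);
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0) {
        puts("priority inheritance not supported on this system");
        return 1;
    }
    pthread_mutex_init(&lock, &attr);

    pthread_mutex_lock(&lock);
    /* critical section: while higher-priority threads are blocked on
     * this mutex, the holder runs at their priority */
    pthread_mutex_unlock(&lock);

    pthread_mutex_destroy(&lock);
    pthread_mutexattr_destroy(&attr);
    return 0;
}
```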


2014 ◽  
Vol 23 (3) ◽  
pp. 132-139 ◽  
Author(s):  
Lauren Zubow ◽  
Richard Hurtig

Children with Rett Syndrome (RS) are reported to use multiple modalities to communicate although their intentionality is often questioned (Bartolotta, Zipp, Simpkins, & Glazewski, 2011; Hetzroni & Rubin, 2006; Sigafoos et al., 2000; Sigafoos, Woodyatt, Tucker, Roberts-Pennell, & Pittendreigh, 2000). This paper will present results of a study analyzing the unconventional vocalizations of a child with RS. The primary research question addresses the ability of familiar and unfamiliar listeners to interpret unconventional vocalizations as “yes” or “no” responses. This paper will also address the acoustic analysis and perceptual judgments of these vocalizations. Pre-recorded isolated vocalizations of “yes” and “no” were presented to 5 listeners (mother, father, 1 unfamiliar, and 2 familiar clinicians) and the listeners were asked to rate the vocalizations as either “yes” or “no.” The ratings were compared to the original identification made by the child's mother during the face-to-face interaction from which the samples were drawn. Findings of this study suggest, in this case, the child's vocalizations were intentional and could be interpreted by familiar and unfamiliar listeners as either “yes” or “no” without contextual or visual cues. The results suggest that communication partners should be trained to attend to eye-gaze and vocalizations to ensure the child's intended choice is accurately understood.


1999 ◽  
Vol 15 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Lutz F. Hornke

Summary: Item parameters for several hundred items were estimated based on empirical data from several thousand subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of items complied sufficiently, so the remaining ones were assembled into well-fitting item banks. In several simulation studies 5000 simulated responses were generated in accordance with a computerized adaptive test procedure along with person parameters. A general reliability of .80, or a standard error of measurement of .44, was used as a stopping rule to end CAT testing. We also recorded how often each item was used across all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For all 1PL-fitting item banks, most simulees needed more than 20 but fewer than 30 items to reach the pre-set level of measurement error. However, testing based on item banks that complied with the 2PL revealed that, on average, only 10 items were sufficient to end testing at the same measurement error level. Both results clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday use will show whether these trends hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
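The stopping rule has a simple computational core: under the 2PL model, an item's Fisher information at ability theta is a²P(1−P) with P = 1/(1+exp(−a(theta−b))), and testing ends once the standard error 1/√(ΣI) falls below .44, which for standardized abilities corresponds to reliability 1 − SE² ≈ .80. Below is a C sketch with invented item parameters; with items of moderate discrimination near the ability estimate, it stops after about 10 items, consistent with the 2PL result above.

```c
#include <math.h>
#include <stdio.h>

/* 2PL item response probability */
static double p_2pl(double a, double b, double theta) {
    return 1.0 / (1.0 + exp(-a * (theta - b)));
}

int main(void) {
    /* invented discrimination (a) and difficulty (b) parameters */
    const double a[] = {1.6, 1.4, 1.8, 1.2, 1.5, 1.7, 1.3, 1.6, 1.4, 1.5};
    const double b[] = {0.1, -0.2, 0.3, 0.0, -0.1, 0.2, 0.4, -0.3, 0.1, 0.0};
    const double theta = 0.0;           /* current ability estimate */
    const double se_target = 0.44;      /* i.e. reliability ~ .80 */
    double info = 0.0;

    for (int n = 0; n < 10; n++) {      /* administer items in turn */
        double p = p_2pl(a[n], b[n], theta);
        info += a[n] * a[n] * p * (1.0 - p);  /* accumulate information */
        double se = 1.0 / sqrt(info);
        printf("after item %2d: SE = %.3f\n", n + 1, se);
        if (se <= se_target) {
            printf("stop: precision reached after %d items\n", n + 1);
            break;
        }
    }
    return 0;
}
```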

