A Method to Support Fault Tolerance Design in Service Oriented Computing Systems

Author(s):  
Domenico Cotroneo ◽  
Antonio Pecchia ◽  
Roberto Pietrantuono ◽  
Stefano Russo

Service Oriented Computing relies on the integration of heterogeneous software technologies and infrastructures that provide developers with a common ground for composing services and producing applications flexibly. However, this approach eases software development but makes dependability a big challenge. Integrating such diverse software items raise issues that traditional testing is not able to exhaustively cope with. In this context, tolerating faults, rather than attempt to detect them solely by testing, is a more suitable solution. This paper proposes a method to support a tailored design of fault tolerance actions for the system being developed. This paper describes system failure behavior through an extensive fault injection campaign to figure out its criticalities and adopt the most appropriate countermeasures to tolerate operational faults. The proposed method is applied to two distinct SOC-enabling technologies. Results show how the achieved findings allow designers to understand the system failure behavior and plan fault tolerance.

Author(s):  
Domenico Cotroneo ◽  
Antonio Pecchia ◽  
Roberto Pietrantuono ◽  
Stefano Russo

Service Oriented Computing relies on the integration of heterogeneous software technologies and infrastructures that provide developers with a common ground for composing services and producing applications flexibly. However, this approach eases software development but makes dependability a big challenge. Integrating such diverse software items raise issues that traditional testing is not able to exhaustively cope with. In this context, tolerating faults, rather than attempt to detect them solely by testing, is a more suitable solution. This paper proposes a method to support a tailored design of fault tolerance actions for the system being developed. This paper describes system failure behavior through an extensive fault injection campaign to figure out its criticalities and adopt the most appropriate countermeasures to tolerate operational faults. The proposed method is applied to two distinct SOC-enabling technologies. Results show how the achieved findings allow designers to understand the system failure behavior and plan fault tolerance.


2021 ◽  
Vol 118 ◽  
pp. 252-256
Author(s):  
Sami Yangui ◽  
Andrzej Goscinski ◽  
Khalil Drira ◽  
Zahir Tari ◽  
Djamal Benslimane

Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1179
Author(s):  
Jonatan Sánchez ◽  
Antonio da Silva ◽  
Pablo Parra ◽  
Óscar R. Polo ◽  
Agustín Martínez Hellín ◽  
...  

Multicore hardware platforms are being incorporated into spacecraft on-board systems to achieve faster and more efficient data processing. However, such systems lead to increased complexity in software development and represent a considerable challenge, especially concerning the runtime verification of fault-tolerance requirements. To address the ever-challenging verification of this kind of requirement, we introduce a LEON4 multicore virtual platform called LeonViP-MC. LeonViP-MC is an evolution of a previous development called Leon2ViP, carried out by the Space Research Group of the University of Alcalá (SRG-UAH), which has been successfully used in the development and testing of the flight software of the instrument control unit (ICU) of the energetic particle detector (EPD) on board the Solar Orbiter. This paper describes the LeonViP-MC architectural design decisions oriented towards fault-injection campaigns to verify software fault-tolerance mechanisms. To validate the simulator, we developed an ARINC653 communications channel that incorporates fault-tolerance mechanisms and is currently being used to develop a hypervisor level for the GR740 platform.


Sign in / Sign up

Export Citation Format

Share Document