Impact of Hardware/Software Faults on System Reliability. Volume 2. Procedures for Use of Methodology.

Author(s):  
Edward C. Soistman ◽  
Katherine B. Ragsdale
2015 ◽  
Vol 25 (09n10) ◽  
pp. 1491-1513 ◽  
Author(s):  
Jean Rahme ◽  
Haiping Xu

Correctly measuring the reliability and availability of a cloud-based system is critical for evaluating its performance. Because the physical facilities behind cloud services promise high reliability, software faults have become one of the major causes of failure in cloud-based systems. In this paper, we focus on the software aging phenomenon, in which system performance progressively degrades due to exhaustion of system resources, fragmentation, and accumulation of errors. We use a proactive technique called software rejuvenation to counteract software aging. The dynamic fault tree (DFT) formalism is adopted to model system reliability before and during a software rejuvenation process in an aging cloud-based system. A novel analytical approach is presented to derive the reliability function of a cloud-based Hot SPare (HSP) gate, and its correctness is verified using Continuous Time Markov Chains (CTMC). We use a case study of a cloud-based system to illustrate the validity of our approach. Based on the analytical reliability results, we show how cost-effective software rejuvenation schedules can be created to keep the system reliability consistently above a predefined critical level.
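
The abstract's idea of deriving a hot spare gate's reliability, cross-checking it against a CTMC, and then picking a rejuvenation time from a critical level can be illustrated with a minimal sketch. This is not the paper's model: it assumes a single primary with one hot spare, constant (exponential) failure rates, and no repair, whereas the paper treats aging systems. The function names, the rates, and the forward-Euler solver are all hypothetical choices made for illustration.

```python
import math


def hsp_reliability(t, lam_primary, lam_spare):
    """Closed-form reliability of a two-unit hot spare gate.

    Assumption: both units are powered from t = 0 and fail independently
    with constant rates, so the gate fails only when both have failed.
    """
    q_primary = 1.0 - math.exp(-lam_primary * t)
    q_spare = 1.0 - math.exp(-lam_spare * t)
    return 1.0 - q_primary * q_spare


def hsp_reliability_ctmc(t, lam_primary, lam_spare, steps=100_000):
    """Cross-check via the CTMC, integrated with simple forward Euler.

    Transient states: both up, primary only, spare only.
    The fourth state (both failed) is absorbing and is not tracked.
    """
    p_both, p_primary_only, p_spare_only = 1.0, 0.0, 0.0
    dt = t / steps
    for _ in range(steps):
        d_both = -(lam_primary + lam_spare) * p_both
        d_primary_only = lam_spare * p_both - lam_primary * p_primary_only
        d_spare_only = lam_primary * p_both - lam_spare * p_spare_only
        p_both += d_both * dt
        p_primary_only += d_primary_only * dt
        p_spare_only += d_spare_only * dt
    # Reliability is the probability of being in any non-failed state.
    return p_both + p_primary_only + p_spare_only


def time_to_threshold(critical, lam_primary, lam_spare, t_max=1e6, tol=1e-6):
    """Latest time at which reliability is still >= critical (bisection).

    Relies on R(t) decreasing monotonically from 1 and on R(t_max) < critical.
    """
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if hsp_reliability(mid, lam_primary, lam_spare) >= critical:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    lam = 1e-3  # hypothetical failure rate per hour
    print(hsp_reliability(1000.0, lam, lam))
    print(hsp_reliability_ctmc(1000.0, lam, lam))
    print(time_to_threshold(0.99, lam, lam))
```

Under these assumptions, rejuvenating no later than the time returned by `time_to_threshold` would keep the gate's reliability at or above the chosen critical level. A model of actual aging would replace the constant rates with time-dependent ones.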


2006 ◽  
Author(s):  
Elizabeth T. Newlin ◽  
Ernesto A. Bustamante ◽  
James P. Bliss ◽  
Randall D. Spain ◽  
Corey K. Fallon

2011 ◽  
Vol 70 (9) ◽  
pp. 803-807
Author(s):  
I. Yu. Lysanov ◽  
A. N. Zbinyakov ◽  
Yu. N. Belikov ◽  
I. S. Zakharov ◽  
V. M. Radygin
