Transient analysis of computing system with reboot and recovery delay

2020 ◽  
Vol 37 (6/7) ◽  
pp. 983-1005
Author(s):  
Chandra Shekhar ◽  
Amit Gupta ◽  
Madhu Jain ◽  
Neeraj Kumar

Purpose: The purpose of this paper is to present a sensitivity analysis of fault-tolerant redundant repairable computing systems with imperfect coverage, reboot and recovery process. Design/methodology/approach: In this investigation, the authors consider a computing system having a finite number of identical working units functioning simultaneously, with the provision of standby units. Working and standby units are prone to random failure and are administered by unreliable software, which is also prone to unpredictable failure. The redundant repairable computing system is modeled as a Markovian machine interference problem with exponentially distributed failure times and service times. To remove the failed unit from the computing system, the system either opts for a randomized reboot process or incurs a recovery delay. Findings: Transient-state probabilities have been determined, from which the authors develop various reliability measures, namely reliability/availability, mean time to failure, failure frequency, and so on, and queueing characteristics, namely expected number of failed units, the throughput of the system and so on, for predictive purposes. To demonstrate the practicability of the developed model, numerical simulation and sensitivity analysis for different parameters have also been carried out, and the results are summarized in tables and graphs. The transient results are helpful for analyzing the evolution of the system before it reaches stability. The derived measures give direct insights into parametric decision-making. Social implications: Conclusions are drawn, and future scope is noted.
The present research study would help system analysts and system designers make better decisions in order to obtain an economical design and strategy based on the desired mean time to failure, reliability/availability of the systems and other queueing characteristics. Originality/value: Different from previous investigations, the studied model provides a more accurate assessment of the computing system compared to uncertain environments based on sensitivity analysis.
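
The transient-state probabilities mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' model: it assumes a plain machine-repair chain with `n_units` identical units, a single repairer, no standbys, no reboot or recovery delay, and hypothetical rates `lam` and `mu`. It uses uniformization, a standard numerical technique for transient analysis of continuous-time Markov chains; the function names are illustrative only.

```python
import math


def build_generator(n_units, lam, mu):
    """Generator matrix Q of an assumed machine-repair chain.

    State i is the number of failed units (0..n_units). Each working unit
    fails at rate lam; a single repairer restores units at rate mu.
    """
    size = n_units + 1
    q = [[0.0] * size for _ in range(size)]
    for i in range(size):
        if i < n_units:
            q[i][i + 1] = (n_units - i) * lam  # one more unit fails
        if i > 0:
            q[i][i - 1] = mu                   # one repair completes
        q[i][i] = -sum(q[i])                   # rows of Q sum to zero
    return q


def transient_probabilities(q, p0, t, tol=1e-12, max_terms=100000):
    """State probabilities at time t via uniformization.

    Suitable for moderate rate * t only, since exp(-rate * t) underflows
    for very large values.
    """
    size = len(q)
    rate = max(-q[i][i] for i in range(size))  # uniformization rate
    if rate == 0.0 or t == 0.0:
        return list(p0)
    # Embedded discrete-time chain P = I + Q / rate
    p = [[(1.0 if i == j else 0.0) + q[i][j] / rate for j in range(size)]
         for i in range(size)]
    weight = math.exp(-rate * t)               # Poisson probability of 0 jumps
    vec = list(p0)
    result = [weight * v for v in vec]
    total = weight
    k = 0
    while 1.0 - total > tol and k < max_terms:
        k += 1
        vec = [sum(vec[i] * p[i][j] for i in range(size)) for j in range(size)]
        weight *= rate * t / k                 # Poisson probability of k jumps
        total += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result


q = build_generator(n_units=3, lam=0.1, mu=1.0)
probs = transient_probabilities(q, [1.0, 0.0, 0.0, 0.0], t=5.0)
availability = 1.0 - probs[-1]  # here "available" means at least one unit is up
```

Sweeping `lam` or `mu` and recomputing `availability` gives a crude numerical sensitivity analysis in the spirit of the abstract, under the simplifying assumptions stated above.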

2017 ◽  
Vol 34 (6) ◽  
pp. 770-784 ◽  
Author(s):  
Nupur Goyal ◽  
Mangey Ram ◽  
Shubham Amoli ◽  
Alok Suyal

Purpose: The purpose of this paper is to investigate the reliability measures, namely, availability, reliability, mean time to failure and expected profit. The authors also analyse the sensitivity of these reliability measures. Design/methodology/approach: Given its industrial relevance, the rapid growth of the digital economy demands a generalized system that is easily repairable, extremely reliable and of high quality. Considering reliability as one of the performance measures, the authors have designed a complex system which consists of three subsystems, namely, A, B and C, in series configuration. Subsystem A consists of n units arranged in parallel configuration; subsystem B consists of two sub-subsystems X and Y aligned in parallel with one another, where X is of type 1-out-of-n:F. Failure and repair rates are assumed to follow general distributions. Findings: The system is studied in depth using the supplementary variable technique, Laplace transformation and Markov’s law. Various conclusive results such as availability and reliability of the system, mean time to failure, cost and sensitivity analysis are discussed. Originality/value: Through a systematic view of the reliability measures of the proposed system, the performance of the system can be enhanced under high profit.


Author(s):  
Chandra Shekhar ◽  
Neeraj Kumar ◽  
Madhu Jain ◽  
Amit Gupta

In this paper, we investigate the reliability and queueing performance indices for a fault-tolerant computing network having a finite number of unreliable operating components with the provision of warm standby components. Operating and standby components are governed by dedicated software which is also prone to random failure. On failure of operating components, available standby component(s) may switch from the standby state to the operating state with negligible switchover time. The switchover process may also fail due to some automation hindrance. The computing network is also subject to common cause failure due to external causes. The studied redundant fault-tolerant computing network is framed as a Markovian machine interference model with exponentially distributed inter-failure times and service times. For the reliability prediction of the computing network, various performance measures, namely, mean-time-to-failure (MTTF), reliability/availability, failure frequency, etc., have been formulated in terms of transient-state probabilities, which we have obtained using the spectral method. To show the practicability of the developed model, numerical simulation has been done. Sensitivity analysis of reliability and other indices of the computing network with respect to different network parameters has been presented, and results are summarized in tables and graphs. Finally, future scope and concluding remarks have been included.
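
As a rough illustration of how MTTF follows from a Markovian model, the sketch below uses a standard result for absorbing continuous-time Markov chains: the expected times to absorption `m` from the operational states satisfy `Q_TT m = -1`, where `Q_TT` is the generator restricted to those states. It is not the paper's network. The example is an assumed two-unit parallel system with repair and no standbys, switchover failures or common cause failures; the rates `lam` and `mu` and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mean_time_to_failure(q_transient):
    """Expected time to system failure from each operational state.

    q_transient is the generator restricted to the operational (transient)
    states; system failure is treated as an absorbing state.
    """
    return solve_linear(q_transient, [-1.0] * len(q_transient))


lam, mu = 0.01, 0.5  # assumed failure and repair rates
# State 0: both units up; state 1: one unit up; system failure is absorbing.
q_tt = [
    [-2 * lam, 2 * lam],
    [mu, -(lam + mu)],
]
mttf = mean_time_to_failure(q_tt)
# For this small example, mttf[0] agrees with the closed form
# (3 * lam + mu) / (2 * lam ** 2).
```

Recomputing `mttf[0]` for perturbed values of `lam` and `mu` gives a simple finite-difference view of the sensitivity analysis the abstract describes.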


2020 ◽  
Vol 37 (4) ◽  
pp. 517-537
Author(s):  
Nisha Nautiyal ◽  
S.B. Singh ◽  
Soni Bisht

Purpose: The present paper focuses on the evaluation of reliability and its characteristics (mean time to failure and sensitivity) of a k-out-of-n network. Design/methodology/approach: The minimal cuts of the network have been evaluated for different nodes in this paper, using an algorithm. With the help of these cuts, reliability and its characteristics are obtained using the Gumbel–Hougaard family of copula. Findings: The present paper proposes to compute the reliability and its measures of the k-out-of-n network using the minimal cuts and copula methodology. The completely failed nodes of the network have been repaired using the Gumbel–Hougaard family of copula. Originality/value: In this paper, the reliability of a k-out-of-n network has been evaluated by first calculating k-out-of-n minimal cuts, and the failed nodes have also been repaired using the Gumbel–Hougaard family of copula, unlike in previous work.


Author(s):  
SWAPNA S. GOKHALE

Architecture-based techniques for reliability assessment of software applications have received increased attention in the past few years due to the advent of component-based software development paradigm. Most of the prior research efforts in architecture-based analysis use the composite solution approach to solve the architecture-based models in order to estimate application reliability. Though the composite solution approach produces an accurate estimate of application reliability, it suffers from several drawbacks. The most notable drawback of the composite solution approach is that it does not allow an analysis of the sensitivity of the application reliability to the reliabilities of the components comprising the application and the application structure. The hierarchical solution approach on the other hand, has the potential of overcoming the drawbacks of the composite approach. However, in the present form, the hierarchical solution approach produces an estimate of application reliability which is only an approximation of the estimate produced by the composite approach since it does not take into consideration the second-order architectural statistics. Also, although the hierarchical solution approach can be used for sensitivity analysis, mathematical techniques to perform such analysis are lacking. Development of an accurate hierarchical solution approach to estimate application reliability based on its architecture is the focus of this paper. Using the approach described in this paper, an analytical application reliability function which incorporates second-order architectural statistics can be obtained. Sensitivity analysis techniques and expressions to determine the mean time to failure of the application are developed based on this analytical reliability function. We illustrate the reliability prediction, sensitivity analysis, and mean time to failure computation techniques presented in this paper using two case studies.
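
To make the hierarchical idea concrete, here is a minimal sketch of the first-order estimate only, as it is commonly presented in the architecture-based reliability literature. The application architecture is assumed to be a discrete-time Markov chain of control transfers between components, and application reliability is approximated as the product of each component reliability raised to its expected number of visits. This is not the second-order method developed in the paper; the transition matrix, the reliabilities and the function names are all hypothetical.

```python
import math


def expected_visits(p, start=0, tol=1e-12, max_iter=100000):
    """Expected number of visits to each component before the application exits.

    p[i][j] is the probability that control passes from component i to j.
    Rows may sum to less than 1; the missing mass is the exit probability.
    The visits are accumulated as the series sum over k of e_start * P^k.
    """
    n = len(p)
    current = [0.0] * n
    current[start] = 1.0
    visits = current[:]
    for _ in range(max_iter):
        current = [sum(current[i] * p[i][j] for i in range(n)) for j in range(n)]
        visits = [v + c for v, c in zip(visits, current)]
        if sum(current) < tol:  # remaining probability mass is negligible
            break
    return visits


def hierarchical_reliability(p, reliabilities, start=0):
    """First-order hierarchical estimate: product of R_i ** E[visits_i]."""
    visits = expected_visits(p, start)
    return math.prod(r ** v for r, v in zip(reliabilities, visits))


# Assumed architecture: component 0 always calls component 1, which loops
# back to component 0 with probability 0.5 and otherwise exits.
p = [
    [0.0, 1.0],
    [0.5, 0.0],
]
estimate = hierarchical_reliability(p, [0.999, 0.995])
```

Because the estimate is an explicit function of each component reliability, its sensitivity to a single component can be probed by perturbing one entry of the reliabilities list and recomputing.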


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Anil Kr. Aggarwal

Purpose: This paper deals with the performance optimization and sensitivity analysis for the crystallization system of a sugar plant. Design/methodology/approach: The crystallization system comprises five subsystems, namely crystallizer, centrifugal pump and sugar grader. The Chapman–Kolmogorov differential equations are derived from the transition diagram of the crystallization system using the mnemonic rule. These equations are solved to compute reliability and steady-state availability by substituting appropriate combinations of failure and repair rates, using normalizing and initial boundary conditions. The performance optimization is carried out by varying the number of generations, population size, and crossover and mutation probabilities. Finally, sensitivity analysis is performed to analyze the effect of changes in the failure rates of each subsystem on availability, mean time between failures (MTBF) and mean time to repair (MTTR). Findings: The highest performance observed is 96.95% at a crossover probability of 0.3, and the sugar grader subsystem turns out to be the most critical and sensitive subsystem. Originality/value: The findings of the paper highlight the optimum performance level at given failure and repair rates for the subsystems and also help identify the most sensitive subsystem. These findings are highly beneficial for the maintenance personnel of the plant in planning maintenance strategies accordingly.


2021 ◽  
Vol 58 (2) ◽  
pp. 289-313
Author(s):  
Ruhul Ali Khan ◽  
Dhrubasish Bhattacharyya ◽  
Murari Mitra

Abstract: The performance and effectiveness of an age replacement policy can be assessed by its mean time to failure (MTTF) function. We develop shock model theory in different scenarios for classes of life distributions based on the MTTF function where the probabilities $\bar{P}_k$ of surviving the first k shocks are assumed to have discrete DMTTF, IMTTF and IDMTTF properties. The cumulative damage model of A-Hameed and Proschan [1] is studied in this context and analogous results are established. Weak convergence and moment convergence issues within the IDMTTF class of life distributions are explored. The preservation of the IDMTTF property under some basic reliability operations is also investigated. Finally we show that the intersection of IDMRL and IDMTTF classes contains the BFR family and establish results outlining the positions of various non-monotonic ageing classes in the hierarchy.


2011 ◽  
Vol 110-116 ◽  
pp. 2497-2503 ◽  
Author(s):  
Zdenek Vintr ◽  
Michal Vintr

Rolling bearings are usually considered non-repaired items whose reliability is characterized by mean time to failure, or the so-called basic rating life. These parameters describe reliability well when the bearings are operated until failure occurs, or for a time corresponding to the basic rating life. In railway applications, the bearings are often used in large groups and are preventively replaced after a much shorter operating time than their basic rating life. The article presents a model that describes bearing reliability in this specific case and makes it possible to estimate the number of failures to be expected from a group of bearings during the operating time, or to determine the mean operating time between failures of bearings.


2021 ◽  
Author(s):  
Lavanya Vadamodala ◽  
Abdul Wahab Bandarkar ◽  
Shuvajit Das ◽  
Md Ehsanul Haque ◽  
Anik Chowdhury ◽  
...  

2018 ◽  
Author(s):  
Fahad Al Adi ◽  
Afrinaldi Zulhen ◽  
Masrisetyo Adi ◽  
Hassan Al Saadi ◽  
Miguel Marcano ◽  
...  

2008 ◽  
Vol 7 (4) ◽  
pp. 307-326
Author(s):  
Zimoch Izabela

Reliability Analysis of Water Distribution Subsystem

This paper presents the results of a detailed reliability analysis of the operation of the water distribution subsystem of the city of Krakow. The research was based on an extensive record of failures that occurred during operation (1996-2006). The analysis included evaluation of basic factors such as failure and renovation intensities, mean recovery time and mean time to failure, availability factor and probability of failure-free operation at any time. Moreover, a broad analysis of pipe failure rates as a function of pipe diameter and material was performed. The paper also presents findings on the causes and consequences of the pipe failures that occurred.

