Hardware Failure


Author(s):  
Bruce Geddes ◽  
Ray Torok

The Electric Power Research Institute (EPRI) is conducting research in cooperation with the Nuclear Energy Institute (NEI) regarding Operating Experience of digital Instrumentation and Control (I&C) systems in US nuclear power plants. The primary objective of this work is to extract insights from US nuclear power plant Operating Experience (OE) reports that can be applied to improve Diversity and Defense in Depth (D3) evaluations and methods for protecting nuclear plants against I&C related Common Cause Failures (CCF) that could disable safety functions and thereby degrade plant safety. Between 1987 and 2007, over 500 OE events involving digital equipment in US nuclear power plants were reported through various channels. OE reports for 324 of these events were found in databases maintained by the Nuclear Regulatory Commission (NRC) and the Institute of Nuclear Power Operations (INPO). A database was prepared for capturing the characteristics of each of the 324 events in terms of when, where, how, and why the event occurred, what steps were taken to correct the deficiency that caused the event, and what defensive measures could have been employed to prevent recurrence of these events. The database also captures the plant system type, its safety classification, and whether or not the event involved a common cause failure. This work has revealed the following results and insights:
- 82 of the 324 “digital” events did not actually involve a digital failure. Of these 82 non-digital events, 34 might have been prevented by making full use of digital system fault tolerance features.
- 242 of the 324 events did involve failures in digital systems. The leading contributors to the 242 digital failures were hardware failure modes. Software change appears as a corrective action twice as often as it appears as an event root cause. This suggests that software features are being added to avoid recurrence of hardware failures, and that adequately designed software is a strong defensive measure against hardware failure modes, preventing them from propagating into system failures and ultimately plant events.
- 54 of the 242 digital failures involved a Common Cause Failure (CCF).
- 13 of the 54 CCF events affected safety (1E) systems, and only 2 of those were due to Inadequate Software Design. This finding suggests that software related CCFs on 1E systems are no more prevalent than other CCF mechanisms for which adherence to various regulations and standards is considered to provide adequate protection against CCF.
This research provides an extensive data set that is being used to investigate many different questions related to failure modes, causes, corrective actions, and other event attributes that can be compared and contrasted to reveal useful insights. Specific considerations in this study included comparison of 1E vs. non-1E systems, active vs. potential CCFs, and possible defensive measures to prevent these events. This paper documents the dominant attributes of the evaluated events and the associated insights that can be used to improve methods for protecting against digital I&C related CCFs, applying a test of reasonable assurance.
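
The event database and tallies reported above can be illustrated with a rough sketch. The record fields below are hypothetical stand-ins for the when/where/how/why attributes the abstract describes; only the counts come from the abstract.

```python
# Hypothetical sketch of an Operating Experience (OE) event record and the
# tallies reported in the abstract; field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class OEEvent:
    year: int                  # when the event occurred
    plant_system: str          # where (plant system type)
    failure_mode: str          # how the deficiency manifested
    root_cause: str            # why it occurred
    corrective_action: str     # steps taken to correct the deficiency
    defensive_measure: str     # measure that could prevent recurrence
    safety_class_1e: bool      # 1E (safety) vs. non-1E classification
    digital_failure: bool      # whether a digital failure was actually involved
    common_cause: bool         # whether the event involved a CCF

# Counts reported in the abstract (1987-2007, 324 events with OE reports).
non_digital = 82        # 34 of these preventable via fault-tolerance features
digital = 242           # hardware failure modes were the leading contributors
digital_ccf = 54        # CCF events among the digital failures
ccf_on_1e = 13          # CCF events affecting safety (1E) systems
ccf_on_1e_sw = 2        # of those, attributed to inadequate software design
assert non_digital + digital == 324
```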


2021 ◽  
Vol 11 (22) ◽  
pp. 10616
Author(s):  
Jingtian Xu ◽  
Man Yang ◽  
Shugang Li

The hardware reliability of a gas monitoring system was investigated using the fuzzy fault tree analysis method. A fault tree was developed considering the hardware failure of the gas monitoring system as the top event. Two minimum path sets were obtained through qualitative analysis using the ascending method. The concept of fuzzy numbers from fuzzy set theory was applied to describe the probability of basic event occurrence in the fault tree, and the fuzzy failure probabilities of the middle and top events were calculated using fuzzy AND and OR operators. The results show that the proposed fuzzy fault tree is an effective method of reliability analysis for gas monitoring systems. Results calculated with this method are more reasonable than those obtained with the conventional fault tree method.
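
As a rough illustration of the gate operations mentioned above, the sketch below evaluates fuzzy AND and OR operators on triangular fuzzy numbers (lower, modal, upper). It is not the paper's exact formulation; the operator definitions and the basic-event probabilities are assumptions for illustration.

```python
# Minimal sketch: fuzzy gate evaluation with triangular fuzzy numbers (a, m, b).

def fuzzy_and(*events):
    """AND gate: all inputs must fail, so the probabilities multiply."""
    lo = mo = up = 1.0
    for a, m, b in events:
        lo, mo, up = lo * a, mo * m, up * b
    return (lo, mo, up)

def fuzzy_or(*events):
    """OR gate: any input suffices, so P = 1 - prod(1 - p_i)."""
    lo = mo = up = 1.0
    for a, m, b in events:
        lo, mo, up = lo * (1 - a), mo * (1 - m), up * (1 - b)
    return (1 - lo, 1 - mo, 1 - up)

# Hypothetical basic-event failure probabilities for two components.
sensor_fault = (0.001, 0.002, 0.004)
cable_fault = (0.0005, 0.001, 0.002)

print(fuzzy_or(sensor_fault, cable_fault))   # top event: either failure disables monitoring
print(fuzzy_and(sensor_fault, cable_fault))  # middle event: both must fail
```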


2021 ◽  
Author(s):  
Mohammed Eltayb

Fault tolerant control (FTC) is essential nowadays in the automation industry, as it provides a means for higher equipment availability. Faults in dynamical systems can occur due to the deviation of system parameters from the normal operating range, or as a structural change from the normal situation of continuous operation, such as the blocking of an actuator due to mechanical stiction. In this research project, a fault tolerant controller is designed in MATLAB Simulink for a feedwater system. The feedwater system components are modified to work under an embedded controller design with FTC attached to it. Feedwater systems usually consist of a de-aerator or simply a water storage tank, feedwater pumps, control valves, piping, and supporting fittings such as check valves, flanges, hoses, and relief valves, besides instrumentation devices such as level transmitters, flow transmitters, and pressure regulators. The faults are injected separately for each device. Fault diagnosis is used to detect and identify the faults by the limit-checking method. A controller is then reconfigured to correct the hardware failures in the control valve, level sensor, and feedwater pump. The simulation results reveal that the redundant components can take over and handle the process operation when a fault occurs in the duty components. Level sensors are set to work in on-line mode, while the control valves are set to work in off-line mode because of their moving mechanical parts. Setting the control valves in on-line mode reduces the probability of valve stiction and extends component availability. The results show that feedwater system operation is not stopped when a hardware failure takes place in any of the major feedwater system components. Disturbances are not considered in this research, as many control techniques are available to handle them in a robust way.
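
The limit-checking diagnosis and switchover to redundant components described above can be sketched as follows. The signal names, limits, and sensor stubs are hypothetical, not taken from the project.

```python
# Minimal sketch of limit-checking fault detection with reconfiguration to a
# redundant (standby) level sensor; values and limits are illustrative.

LEVEL_MIN, LEVEL_MAX = 0.0, 100.0   # assumed normal tank-level range, in percent

def limit_check(value, low, high):
    """Declare a fault when a measurement leaves its normal operating range."""
    return value < low or value > high

def read_level(duty_sensor, standby_sensor):
    """Use the duty sensor; if limit checking flags it, switch to the standby."""
    value = duty_sensor()
    if limit_check(value, LEVEL_MIN, LEVEL_MAX):
        value = standby_sensor()         # redundant component takes over
    return value

# Example: the duty sensor fails high (stuck at an implausible reading).
faulty_sensor = lambda: 250.0
healthy_sensor = lambda: 42.5
print(read_level(faulty_sensor, healthy_sensor))  # 42.5 -> operation continues
```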


Neurosurgery ◽  
2010 ◽  
Vol 66 (3) ◽  
pp. E620-E622 ◽  
Author(s):  
Alexander Taghva ◽  
Khan W. Li ◽  
John C. Liu ◽  
Ziya L. Gokaslan ◽  
Patrick C. Hsieh

OBJECTIVE: Metastatic epidural spinal cord compression is a potentially devastating complication of cancer and is estimated to occur in 5% to 14% of all cancer patients. It is best treated surgically. Minimally invasive spine surgery has the potential benefits of decreased surgical approach–related morbidity, blood loss, hospital stay, and time to mobilization.
CLINICAL PRESENTATION: A 36-year-old man presented with worsening back pain and lower extremity weakness. Workup revealed metastatic adenocarcinoma of the lung with spinal cord compression at T4 and T5.
INTERVENTION AND TECHNIQUE: T4 and T5 vertebrectomy with expandable cage placement and T1–T8 pedicle screw fixation and fusion were performed using minimally invasive surgical techniques.
RESULT: The patient improved neurologically and was ambulatory on postoperative day 1. At the 9-month follow-up point, he remained neurologically intact and pain free, and there was no evidence of hardware failure.
CONCLUSION: Minimally invasive surgical circumferential decompression may be a viable option for the treatment of metastatic epidural spinal cord compression.


Author(s):  
Jon Calhoun ◽  
Franck Cappello ◽  
Luke N Olson ◽  
Marc Snir ◽  
William D Gropp

Checkpoint restart plays an important role in high-performance computing (HPC) applications, allowing simulation runtime to extend beyond a single job allocation and facilitating recovery from hardware failure. Yet, as machines grow in size and in complexity, traditional approaches to checkpoint restart are becoming prohibitive. Current methods store a subset of the application’s state and exploit the memory hierarchy in the machine. However, as the energy cost of data movement continues to dominate, further reductions in checkpoint size are needed. Lossy compression, which can significantly reduce checkpoint sizes, offers the potential to reduce the computational cost of checkpoint restart. This article investigates the use of numerical properties of partial differential equation (PDE) simulations, such as bounds on the truncation error, to evaluate the feasibility of using lossy compression in checkpointing PDE simulations. Restart from a checkpoint with lossy compression is considered for a fail-stop error in two time-dependent HPC application codes: PlasComCM and Nek5000. Results show that error in application variables due to a restart from a lossy compressed checkpoint can be masked by the numerical error in the discretization, leading to increased efficiency in checkpoint restart without influencing overall accuracy in the simulation.
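
The central idea, that restart error from a lossy checkpoint can be hidden beneath the discretization error, can be sketched roughly as follows. This is not the PlasComCM or Nek5000 implementation; a uniform quantizer stands in for a real lossy compressor, and the field, grid spacing, and error bound are illustrative assumptions.

```python
# Minimal sketch: lossy checkpointing with the compression error bounded by the
# scheme's truncation error, so the restart error is masked by discretization error.
import numpy as np

def compress(field, abs_tol):
    """Lossy 'compression': quantize to steps of 2*abs_tol, bounding the error by abs_tol."""
    return np.round(field / (2 * abs_tol)).astype(np.int64)

def decompress(quantized, abs_tol):
    """Reconstruct the field from its quantized checkpoint."""
    return quantized * (2 * abs_tol)

dx = 1e-3                      # assumed grid spacing
truncation_error = dx ** 2     # e.g. a second-order discretization, O(dx^2)
u = np.random.rand(1024)       # stand-in for a PDE solution field

checkpoint = compress(u, truncation_error)            # what would be written to disk
u_restart = decompress(checkpoint, truncation_error)  # state recovered after a fail-stop error

# The restart error stays below the truncation error of the discretization.
assert np.max(np.abs(u - u_restart)) <= truncation_error
```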

