hardware failures
Recently Published Documents


TOTAL DOCUMENTS: 103 (FIVE YEARS 41)

H-INDEX: 12 (FIVE YEARS 2)

2021 ◽  
Vol 4 ◽  
Author(s):  
Dragi Kimovski ◽  
Roland Mathá ◽  
Gabriel Iuhasz ◽  
Fabrizio Marozzo ◽  
Dana Petcu ◽  
...  

The execution of complex distributed applications in exascale systems faces many challenges, as it involves empirical evaluation of countless code variations and application runtime parameters over a heterogeneous set of resources. To mitigate these challenges, the research field of autotuning has gained momentum. Autotuning automates the identification of the most desirable application implementation in terms of code variations and runtime parameters. However, the complexity and size of exascale systems make the autotuning process very difficult, especially considering the number of parameter variations that have to be identified. Therefore, we introduce a novel approach for autotuning exascale applications based on a genetic multi-objective optimization algorithm integrated within the ASPIDE exascale computing framework. The approach considers a multi-dimensional search space with support for pluggable objective functions, including execution time and energy requirements. Furthermore, the autotuner employs a machine learning-based event detection approach to detect events and anomalies during application execution, such as hardware failures or communication bottlenecks.
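The abstract does not say which genetic algorithm is used, so the following is only a minimal sketch of one building block common to multi-objective optimizers: keeping the candidates that no other candidate beats on both execution time and energy (the Pareto front). The configuration labels and all numbers are hypothetical and are not taken from the paper or from ASPIDE.

```python
from typing import List, Tuple

# (label, execution_time_s, energy_j); both objectives are minimized.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse in both objectives and strictly better in one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that is not dominated by any other candidate."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical measurements for four application configurations.
configs = [
    ("threads=8", 120.0, 900.0),
    ("threads=16", 80.0, 1100.0),
    ("threads=32", 75.0, 1600.0),
    ("threads=16,no-simd", 95.0, 1200.0),
]
front = pareto_front(configs)
print([label for label, _, _ in front])
```

In this made-up data set, the "threads=16,no-simd" configuration is slower and more energy-hungry than "threads=16", so it drops out; the remaining three represent different time/energy trade-offs.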


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7683
Author(s):  
Marcantonio Catelani ◽  
Lorenzo Ciani ◽  
Alessandro Bartolini ◽  
Cristiano Del Rio ◽  
Giulia Guidi ◽  
...  

Wireless Sensor Networks are subject to design constraints (e.g., processing capability, storage memory, energy consumption, and fixed deployment) and to harsh outdoor conditions that deeply affect network reliability. The aim of this work is to provide a deeper understanding of the way redundancy and node deployment affect network reliability. In more detail, the paper analyzes the design and implementation of a wireless sensor network for low-power and low-cost applications and calculates its reliability considering the real environmental conditions and the real arrangement of the nodes deployed in the field. The reliability of the system has been evaluated by looking for both hardware failures and communication errors. A reliability prediction based on different handbooks has been carried out to estimate the failure rate of the nodes, which were designed and developed in-house for use in harsh environments. Then, using Fault Tree Analysis, the real deployment of the nodes is taken into account, considering the Wi-Fi coverage area and the possible communication links between nearby nodes. The findings show how different node arrangements provide significantly different reliability. The positioning is therefore essential in order to obtain maximum performance from a Wireless Sensor Network.
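To illustrate why node arrangement changes the result, here is a minimal sketch using the standard series and parallel reliability formulas with a constant failure rate. The failure rate, the mission time, and the two topologies are assumptions chosen for illustration; they are not figures or layouts from the paper.

```python
import math


def node_reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Constant-failure-rate model: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate_per_hour * hours)


def series(reliabilities):
    """Every block must work, so the reliabilities multiply."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel(reliabilities):
    """At least one block must work: 1 minus the product of unreliabilities."""
    q = 1.0
    for r in reliabilities:
        q *= 1.0 - r
    return 1.0 - q


# Hypothetical values: 1e-5 failures per hour over one year (8760 h).
r = node_reliability(1e-5, 8760.0)

# A sensor node that reaches the gateway through a single relay ...
single_relay = series([r, r])
# ... versus the same node with two relays in range (either one suffices).
two_relays = series([r, parallel([r, r])])

print(f"single relay: {single_relay:.4f}, two relays: {two_relays:.4f}")
```

Under these assumptions the arrangement with two reachable relays is more reliable than the single-relay chain, which is the kind of difference a fault tree over the real deployment would quantify.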


2021 ◽  
pp. 70-72
Author(s):  
A. V. Sidorov ◽  
V. A. Sidorov

The first book on a promising new direction — methodology for equipment failure management for technical managers and specialists of industrial enterprises, as well as for all those whose field of expertise includes the need to deal with equipment failures — has been developed and published under the auspices of the Association for Effective Enterprise Asset Management (EAM Association).


Author(s):  
Ariana Moura Cabral ◽  
Adriano Alves Pereira ◽  
Marcus Fraga Vieira ◽  
Bruno Lima Pessôa ◽  
Adriano de Oliveira Andrade

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6516
Author(s):  
Simon Schmidt ◽  
Jens Oberrath ◽  
Paolo Mercorelli

DC-DC converters are widely used in a large number of power conversion applications. As in many other systems, they are designed to automatically prevent dangerous failures or control them when they arise; this is called functional safety. Therefore, random hardware failures such as sensor faults have to be detected and handled properly. This proper handling means achieving or maintaining a safe state according to ISO 26262. However, to achieve or maintain a safe state, a fault has to be detected first. Sensor faults within DC-DC converters are generally detected with hardware-redundant sensors, despite all their drawbacks. Within this article, this redundancy is addressed using observer-based techniques utilizing Extended Kalman Filters (EKFs). Moreover, the paper proposes a fault detection and isolation scheme to guarantee functional safety. For this, a cross-EKF structure is implemented to work in cross-parallel to the real sensors and to replace the sensors in case of a fault. This ensures the continuity of the service in case of sensor faults. This idea is based on the concept of the virtual sensor which replaces the sensor in case of fault. Moreover, the concept of the virtual sensor is broader. In fact, if a system is observable, the observer offers a better performance than the sensor. In this context, this paper gives a contribution in this area. The effectiveness of this approach is tested with measurements on a buck converter prototype.


Author(s):  
Lili E. Schindelar ◽  
Richard M. McEntee ◽  
Robert E. Gallivan ◽  
Brian Katt ◽  
Pedro K. Beredjiklian

Background: Distal radius fractures are one of the most common fractures seen in the elderly. The management of distal radius fractures in the elderly, especially patients older than 80 years, has not been well defined. The purpose of this study was to evaluate operative treatment of distal radius fractures in patients older than 80 years to determine functional outcomes and complication rates. Materials and Methods: A retrospective review was performed to identify patients 80 years or older who were treated for a distal radius fracture with open reduction and internal fixation (ORIF). Medical records were reviewed for demographics, medical history, functional outcomes including quick Disabilities of the Arm, Shoulder, and Hand (qDASH), radiographs, and postoperative complications. Results: There were 40 patients included for review. Average age was 84 years. The preoperative qDASH score was 69. At 6 months follow-up, the postoperative qDASH score was 13 (p < 0.001). There were five (12.5%) complications reported postoperatively. All fractures healed with adequate radiographic alignment and there were no hardware failures. Conclusion: Distal radius fractures in patients older than 80 years treated with ORIF have good functional outcomes and low complication rates. Increased functionality and independence of the elderly, as well as updated implant design, can lead to the effective surgical management of these patients. When indicated from a clinical perspective, operative fixation of distal radius fractures should be considered in patients older than 80 years.


Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 1011
Author(s):  
Iman Kohyarnejadfard ◽  
Daniel Aloise ◽  
Michel R. Dagenais ◽  
Mahsa Shakeri

Advances in technology and computing power have led to the emergence of complex and large-scale software architectures in recent years. However, they are prone to performance anomalies due to various reasons, including software bugs, hardware failures, and resource contention. Performance metrics represent the average load on the system and do not help discover the cause of the problem if abnormal behavior occurs during software execution. Consequently, system experts have to examine a massive amount of low-level tracing data to determine the cause of a performance issue. In this work, we propose an anomaly detection framework that reduces troubleshooting time and guides developers to performance problems by highlighting anomalous parts in trace data. Our framework works by collecting streams of system calls during the execution of a process using the Linux Trace Toolkit Next Generation (LTTng) and sending them to a machine learning module that reveals anomalous subsequences of system calls based on their execution times and frequency. Extensive experiments on real datasets from two different applications (MySQL and Chrome), for varying scenarios in terms of available labeled data, demonstrate the effectiveness of our approach in distinguishing normal sequences from abnormal ones.


2021 ◽  
Vol 11 (14) ◽  
pp. 6335
Author(s):  
Yifan Li ◽  
Hong-Zhong Huang ◽  
Tingyu Zhang

Hard-and-software integrated systems such as command and control systems (C4ISR systems) are typical systems comprising both software and hardware. Failures of such systems result from complicated common cause failures and common (or shared) signals, which make classical reliability analysis methods inapplicable. To this end, this paper applies the Goal-Oriented (GO) methodology to analyze the reliability of a C4ISR system in detail. The reliability and the failure probability of the C4ISR system are obtained from the constructed GO model. At the component level, the reliability of the units of the C4ISR system is computed. Importance analysis of the failures of such a system is carried out using the qualitative analysis capability of the GO model, which identifies critical hardware failures, such as communication module failures and motherboard module failures, as well as critical software failures, such as network module application software failures and decompression module software failures. The method of this paper contributes to the reliability analysis of all hard-and-software integrated systems.


Author(s):  
Nupur Goyal ◽  
Tanuja Joshi ◽  
Mangey Ram

Content Delivery Networks (CDNs) are the backbone of the Internet. A lot of research has been done to make CDNs more reliable. Despite that, the world has suffered from CDN inefficiencies quite a few times, not just due to external hacking attempts but due to internal failures as well. In this research work, the authors have analyzed the performance of a content delivery network through various reliability measures. Considering a basic CDN workflow, they have calculated the reliability and availability of the proposed multi-state system using a Markov process and the Laplace transform. Software or hardware failures in any network component can affect the reliability of the whole system. Therefore, the authors have analyzed the obtained results to find the major causes of failures in the system, which, when avoided, can lead to a faster and more efficient distribution network.
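The paper's multi-state model is not reproduced in the abstract. As a much simpler illustration of the same technique, the sketch below uses the textbook two-state (up/down) Markov model of a single repairable component, whose availability follows from solving the state equations with the Laplace transform: A(t) = mu/(lambda+mu) + lambda/(lambda+mu) * exp(-(lambda+mu) t). The failure and repair rates are hypothetical.

```python
import math


def availability(t: float, failure_rate: float, repair_rate: float) -> float:
    """Point availability of a two-state repairable component that starts in the 'up' state.

    Closed form obtained by inverting the Laplace-domain solution
    P_up(s) = (s + mu) / (s * (s + lambda + mu)).
    """
    total = failure_rate + repair_rate
    steady_state = repair_rate / total
    transient = (failure_rate / total) * math.exp(-total * t)
    return steady_state + transient


# Hypothetical rates: one failure per 1000 h, mean repair time of 10 h.
lam, mu = 0.001, 0.1
for hours in (0.0, 10.0, 100.0, 1000.0):
    print(hours, round(availability(hours, lam, mu), 6))
```

The availability starts at 1 and decays toward the steady-state value mu/(lambda+mu); a multi-state system like the one in the paper would need one equation per state, but the solution procedure is the same.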


2021 ◽  
Author(s):  
Mohammed Eltayb

Fault tolerant control (FTC) is essential nowadays in the automation industry. It provides a means for higher equipment availability. A fault in a dynamical system can occur due to the deviation of the system parameters from the normal operating range. Alternatively, it can be a structural change from the normal situation of continuous operation, such as the blocking of an actuator due to mechanical stiction. In this research project, a fault tolerant controller is designed with Matlab Simulink for a feedwater system. The feedwater system components are modified to work under an embedded controller design with FTC attached to it. Feedwater systems usually consist of a de-aerator or simply a water storage tank, feedwater pumps, control valves, piping and supporting fitting elements such as chock valves, flanges, hoses and relief valves, besides instrumentation devices like level transmitters, flow transmitters and pressure regulators. The faults are injected separately for each device. Fault diagnosis is used to detect and identify the faults with the limit-checking method. Then a controller is reconfigured to correct the hardware failures in the control valve, level sensor, and feedwater pump. The simulation results revealed that the redundant components can take over and handle the process operation when the fault occurs at the duty components. Level sensors are set to work in on-line mode, while the control valves are set to work in off-line mode, due to the mechanical parts movement. Setting the control valves in on-line mode reduces the probability of valve stiction and elongates the component availability. The results reveal that the operation of the feedwater system is not stopped when a hardware failure takes place in any of the major feedwater system components. Moreover, disturbances are not considered in this research, as there are many control techniques that can be used to handle disturbances in a robust way.
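The limit-checking and switch-over idea described above can be sketched in a few lines. This is an illustrative Python sketch, not the author's Simulink model; the function names, the tank-level limits, and the readings are all hypothetical.

```python
def limit_check(value: float, low: float, high: float) -> bool:
    """Limit checking: a reading is considered healthy if it lies inside [low, high]."""
    return low <= value <= high


def select_level(duty: float, standby: float, low: float, high: float):
    """Use the duty sensor while it passes the limit check, else fall back to the standby sensor."""
    if limit_check(duty, low, high):
        return duty, "duty"
    if limit_check(standby, low, high):
        return standby, "standby"
    raise RuntimeError("both level sensors are outside the plausible range")


# Hypothetical tank level limits in metres.
LOW, HIGH = 0.0, 5.0

print(select_level(2.0, 2.1, LOW, HIGH))   # healthy duty sensor
print(select_level(9.9, 2.1, LOW, HIGH))   # duty sensor fault, standby takes over
```

Limit checking only catches readings that leave the plausible range; a sensor stuck at a plausible value would need a different diagnostic, which is outside what the abstract describes.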

