disk failures
Recently Published Documents


2021 ◽  
Vol 30 (4) ◽  
pp. 1-38
Author(s):  
Yingzhe Lyu ◽  
Heng Li ◽  
Mohammed Sayagh ◽  
Zhen Ming (Jack) Jiang ◽  
Ahmed E. Hassan

AIOps (Artificial Intelligence for IT Operations) leverages machine learning models to help practitioners handle the massive data produced during the operations of large-scale systems. However, due to the nature of the operation data, AIOps modeling faces several challenges related to data splitting, such as imbalanced data, data leakage, and concept drift. In this work, we study the data leakage and concept drift challenges in the context of AIOps and evaluate the impact of different modeling decisions on such challenges. Specifically, we perform a case study on two commonly studied AIOps applications: (1) predicting job failures based on trace data from a large-scale cluster environment and (2) predicting disk failures based on disk monitoring data from a large-scale cloud storage environment. First, we observe that the data leakage issue exists in AIOps solutions. Splitting the training and validation datasets by time can significantly reduce such data leakage, making it more appropriate than random splitting in the AIOps context. Second, we show that AIOps solutions suffer from concept drift. Periodically updating AIOps models can help mitigate the impact of such concept drift, while the performance benefit and the modeling cost of increasing the update frequency depend largely on the application data and the models used. Our findings encourage future studies and practices on developing AIOps solutions to pay attention to their data-splitting decisions in order to handle the data leakage and concept drift challenges.
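The time-based split described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `time_based_split`, the `timestamp` field, and the sample records are all hypothetical and chosen only for illustration. The idea is that every training sample precedes every validation sample in time, so the model is never validated on data older than what it was trained on.

```python
from typing import Any, Dict, List, Tuple

Sample = Dict[str, Any]


def time_based_split(
    samples: List[Sample],
    train_fraction: float = 0.75,
    time_key: str = "timestamp",  # hypothetical field name
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples so that all training data precedes all validation data."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    # Order by observation time instead of shuffling, as a random split would.
    ordered = sorted(samples, key=lambda s: s[time_key])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical monitoring records, deliberately out of order.
records = [{"timestamp": t, "failed": 0} for t in (5, 1, 4, 2)]
train, validation = time_based_split(records, train_fraction=0.75)
```

With a random split, a record observed at time 5 could land in the training set while a record from time 2 is used for validation, which is one way future information can leak into training.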


Author(s):  
Alessio Burrello ◽  
Daniele Jahier Pagliari ◽  
Andrea Bartolini ◽  
Luca Benini ◽  
Enrico Macii ◽  
...  

Author(s):  
Qisi Liu ◽  
Liudong Xing

In this paper we model and analyze the survivability and vulnerability of a cloud RAID (Redundant Array of Independent Disks) storage system subject to disk faults and cyber-attacks. Cloud RAID survivability is concerned with the system’s ability to function correctly even in the presence of hazardous events, including disk failures and malicious attacks. Cloud RAID invulnerability is concerned with the system’s ability to function correctly while occupying some state immune to malicious attacks. A continuous-time Markov chain-based method is proposed to perform the disk-level survivability and invulnerability analysis. Combinatorial methods are then presented for the cloud RAID system-level analysis, which can accommodate both homogeneous (based on binomial coefficients) and heterogeneous (based on multi-valued decision diagrams) disks. A detailed case study on a cloud RAID 5 system is conducted to illustrate the application of the proposed methods. The impacts of different parameters on the disk and system survivability and invulnerability are also investigated through numerical analysis.
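As a rough illustration of the binomial-coefficient step for homogeneous disks, the sketch below uses the standard k-out-of-n formula. It is an assumption about the general shape of such an analysis, not the authors' model, which also covers attack states and heterogeneous disks. The function names and the per-disk probability of 0.9 are hypothetical. The only RAID-specific fact used is that a RAID 5 array tolerates the loss of one disk.

```python
from math import comb


def k_out_of_n_reliability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent, identical disks work.

    p is the probability that a single disk is in a working state.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def raid5_reliability(n_disks: int, p_disk_ok: float) -> float:
    """RAID 5 tolerates one failed disk, so it is an (n-1)-out-of-n system."""
    return k_out_of_n_reliability(n_disks, n_disks - 1, p_disk_ok)


# Hypothetical example: 5 disks, each working with probability 0.9.
system_ok = raid5_reliability(5, 0.9)
```

For these illustrative inputs the sum has two terms, all five disks working and exactly four working: 0.9^5 + 5 * 0.9^4 * 0.1 = 0.91854.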


2019 ◽  
Vol 13 (6) ◽  
pp. 748-757
Author(s):  
Xin Chen ◽  
Suihua Cai ◽  
Xiao Ma

2019 ◽  
Vol 214 ◽  
pp. 04046
Author(s):  
Dirk Duellmann ◽  
Alfonso Portabales

The EOS deployment at CERN is a core service used both for scientific data processing and analysis and as a back-end for general end-user storage (e.g. home directories/CERNBOX). The disk failure metrics collected over a period of 1 year from a deployment of some 70k disks allow a first systematic analysis of the behaviour of different hard disk types for the large CERN use cases. In this contribution we describe the data collection and analysis, summarise the measured rates and compare them with other large disk deployments. We further describe initial steps to use the collected failure and SMART metrics to develop a machine learning model that predicts imminent failures and hence avoids service degradation and repair costs.
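The abstract does not state how its failure rates are defined. One common convention in large fleets is the annualized failure rate (AFR): the number of failures divided by the accumulated disk-years of operation. The sketch below assumes that convention; the function name and all input numbers are hypothetical and are not taken from the CERN data.

```python
def annualized_failure_rate(failures: int, disk_days: float) -> float:
    """Failures per disk-year, given the total days of operation across all disks."""
    if failures < 0:
        raise ValueError("failures must be non-negative")
    if disk_days <= 0:
        raise ValueError("disk_days must be positive")
    disk_years = disk_days / 365.0
    return failures / disk_years


# Hypothetical fleet: 1000 disks observed for a full year, 20 of which failed.
afr = annualized_failure_rate(failures=20, disk_days=1000 * 365)
```

Counting disk-days rather than disks keeps the rate comparable between disk types when drives are added or retired part-way through the observation window.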


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 114285-114296
Author(s):  
Xin Gao ◽  
Sen Zha ◽  
Xinpeng Li ◽  
Bo Yan ◽  
Xiao Jing ◽  
...  

Author(s):  
Zhi Qiao ◽  
Jacob Hochstetler ◽  
Shuwen Liang ◽  
Song Fu ◽  
Hsing-bung Chen ◽  
...  
