scholarly journals Disk failures in the EOS setup at CERN

2019 ◽  
Vol 214 ◽  
pp. 04046
Author(s):  
Dirk Duellmann ◽  
Alfonso Portabales

The EOS deployment at CERN is a core service used for both scientific data processing, analysis and as back-end for general end-user storage (eg home directories/CERNBOX). The collected disk failure metrics over a period of 1 year from a deployment size of some 70k disks allows a first systematic analysis of the behaviour of different hard disk types for the large CERN usecases. In this contribution we describe the data collection and analysis, summarise the measured rates and compare them with other large disk deployments. We further describe initial steps to use the collected failure and SMART metrics to develop a machine learning model predicting imminent failures and hence avoid service degradation and repair costs.

2018 ◽  
Vol 20 (5) ◽  
pp. 1131-1147 ◽  
Author(s):  
N. Caradot ◽  
M. Riechel ◽  
M. Fesneau ◽  
N. Hernandez ◽  
A. Torres ◽  
...  

Abstract Deterioration models can be successfully deployed only if decision-makers trust the modelling outcomes and are aware of model uncertainties. Our study aims to address this issue by developing a set of clearly understandable metrics to assess the performance of sewer deterioration models from an end-user perspective. The developed metrics are used to benchmark the performance of a statistical model, namely, GompitZ based on survival analysis and Markov-chains, and a machine learning model, namely, Random Forest, an ensemble learning method based on decision trees. The models have been trained with the extensive CCTV dataset of the sewer network of Berlin, Germany (115,258 inspections). At network level, both models give satisfactory outcomes with deviations between predicted and inspected condition distributions below 5%. At pipe level, the statistical model does not perform better than a simple random model, which attributes randomly a condition class to each inspected pipe, whereas the machine learning model provides satisfying performance. 66.7% of the pipes inspected in bad condition have been predicted correctly. The machine learning approach shows a strong potential for supporting operators in the identification of pipes in critical condition for inspection programs whereas the statistical approach is more adapted to support strategic rehabilitation planning.


Author(s):  
Himanshu Bajpai

Providing support on the rolled-out application/services is one of the major factors in increasing the customer satisfaction which in turn increases the customer retention. Since we are in the era of automation where most of the day-to-day jobs are taken care of or are facilitated by the technologies around us, hence there is a need to reduce manual effort in triaging the support tickets and hence facilitating the person on call to better close the tickets on time with proper remediation. The machine learning model which will be the product of this complete paper will not only help in classifying the tickets but also, if applicable will give the best possible remediation of the ticket there by reducing the manual effort and the time taken on providing necessary solution on the ticket. The objectives of the work are as follows - a) Understand the data that is present in the ticket and figure out the basic understanding like, categories of issues, trends etc. b) Prepare the data which is ready for applying different classification algorithms. d) Identify the best machine learning model which can classify the new incident with utmost accuracy. e) Prepare a machine learning model which can suggest the best possible remediation of the ticket. f) Integrate the best classification model and solution recommender model and wrap it as an API which can be used by end user.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ci Fu ◽  
Xiang Zhang ◽  
Amanda O. Veri ◽  
Kali R. Iyer ◽  
Emma Lash ◽  
...  

AbstractFungal pathogens pose a global threat to human health, with Candida albicans among the leading killers. Systematic analysis of essential genes provides a powerful strategy to discover potential antifungal targets. Here, we build a machine learning model to generate genome-wide gene essentiality predictions for C. albicans and expand the largest functional genomics resource in this pathogen (the GRACE collection) by 866 genes. Using this model and chemogenomic analyses, we define the function of three uncharacterized essential genes with roles in kinetochore function, mitochondrial integrity, and translation, and identify the glutaminyl-tRNA synthetase Gln4 as the target of N-pyrimidinyl-β-thiophenylacrylamide (NP-BTA), an antifungal compound.


2018 ◽  
Author(s):  
Steen Lysgaard ◽  
Paul C. Jennings ◽  
Jens Strabo Hummelshøj ◽  
Thomas Bligaard ◽  
Tejs Vegge

A machine learning model is used as a surrogate fitness evaluator in a genetic algorithm (GA) optimization of the atomic distribution of Pt-Au nanoparticles. The machine learning accelerated genetic algorithm (MLaGA) yields a 50-fold reduction of required energy calculations compared to a traditional GA.


2019 ◽  
Author(s):  
Siddhartha Laghuvarapu ◽  
Yashaswi Pathak ◽  
U. Deva Priyakumar

Recent advances in artificial intelligence along with development of large datasets of energies calculated using quantum mechanical (QM)/density functional theory (DFT) methods have enabled prediction of accurate molecular energies at reasonably low computational cost. However, machine learning models that have been reported so far requires the atomic positions obtained from geometry optimizations using high level QM/DFT methods as input in order to predict the energies, and do not allow for geometry optimization. In this paper, a transferable and molecule-size independent machine learning model (BAND NN) based on a chemically intuitive representation inspired by molecular mechanics force fields is presented. The model predicts the atomization energies of equilibrium and non-equilibrium structures as sum of energy contributions from bonds (B), angles (A), nonbonds (N) and dihedrals (D) at remarkable accuracy. The robustness of the proposed model is further validated by calculations that span over the conformational, configurational and reaction space. The transferability of this model on systems larger than the ones in the dataset is demonstrated by performing calculations on select large molecules. Importantly, employing the BAND NN model, it is possible to perform geometry optimizations starting from non-equilibrium structures along with predicting their energies.


2019 ◽  
Vol 15 (3) ◽  
pp. 206-211 ◽  
Author(s):  
Jihui Tang ◽  
Jie Ning ◽  
Xiaoyan Liu ◽  
Baoming Wu ◽  
Rongfeng Hu

<P>Introduction: Machine Learning is a useful tool for the prediction of cell-penetration compounds as drug candidates. </P><P> Materials and Methods: In this study, we developed a novel method for predicting Cell-Penetrating Peptides (CPPs) membrane penetrating capability. For this, we used orthogonal encoding to encode amino acid and each amino acid position as one variable. Then a software of IBM spss modeler and a dataset including 533 CPPs, were used for model screening. </P><P> Results: The results indicated that the machine learning model of Support Vector Machine (SVM) was suitable for predicting membrane penetrating capability. For improvement, the three CPPs with the most longer lengths were used to predict CPPs. The penetration capability can be predicted with an accuracy of close to 95%. </P><P> Conclusion: All the results indicated that by using amino acid position as a variable can be a perspective method for predicting CPPs membrane penetrating capability.</P>


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This proposed work is used to develop an improved and robust machine learning model for predicting Myocardial Infarction (MI) could have substantial clinical impact. Objectives: This paper explains how to build machine learning based computer-aided analysis system for an early and accurate prediction of Myocardial Infarction (MI) which utilizes framingham heart study dataset for validation and evaluation. This proposed computer-aided analysis model will support medical professionals to predict myocardial infarction proficiently. Methods: The proposed model utilize the mean imputation to remove the missing values from the data set, then applied principal component analysis to extract the optimal features from the data set to enhance the performance of the classifiers. After PCA, the reduced features are partitioned into training dataset and testing dataset where 70% of the training dataset are given as an input to the four well-liked classifiers as support vector machine, k-nearest neighbor, logistic regression and decision tree to train the classifiers and 30% of test dataset is used to evaluate an output of machine learning model using performance metrics as confusion matrix, classifier accuracy, precision, sensitivity, F1-score, AUC-ROC curve. Results: Output of the classifiers are evaluated using performance measures and we observed that logistic regression provides high accuracy than K-NN, SVM, decision tree classifiers and PCA performs sound as a good feature extraction method to enhance the performance of proposed model. From these analyses, we conclude that logistic regression having good mean accuracy level and standard deviation accuracy compared with the other three algorithms. AUC-ROC curve of the proposed classifiers is analyzed from the output figure.4, figure.5 that logistic regression exhibits good AUC-ROC score, i.e. around 70% compared to k-NN and decision tree algorithm. Conclusion: From the result analysis, we infer that this proposed machine learning model will act as an optimal decision making system to predict the acute myocardial infarction at an early stage than an existing machine learning based prediction models and it is capable to predict the presence of an acute myocardial Infarction with human using the heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent the heart disease.


Author(s):  
Dhaval Patel ◽  
Shrey Shrivastava ◽  
Wesley Gifford ◽  
Stuart Siegel ◽  
Jayant Kalagnanam ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document