Multivariate Bernoulli Logit-Normal Model for Failure Prediction

2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Huijuan Shao ◽  
Xinwei Deng ◽  
Chi Zhang ◽  
Shuai Zheng ◽  
Hamed Khorasgani ◽  
...  

The failures of connected devices that are geographically close may be correlated and may even propagate from one device to another. However, there is little research modelling this problem, owing to the lack of insight into the correlations among such devices. Most existing methods build one model per device independently, so they cannot capture the underlying correlations, which can be important information to leverage for failure prediction. To address this problem, we propose a multivariate Bernoulli Logit-Normal model (MBLN) to explicitly model the correlations between devices and predict the failure probabilities of multiple devices simultaneously. The proposed method is applied to a water tank data set in which tanks are connected within a local area. The results indicate that our proposed method outperforms baseline approaches in terms of prediction performance metrics such as ROC.
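A minimal generative sketch of the idea behind a multivariate Bernoulli logit-normal model: correlated latent logits yield correlated failure probabilities, from which joint Bernoulli failures are drawn. The mean vector and covariance below are illustrative placeholders, not values fitted by the authors.

```python
import numpy as np

# Illustrative sketch: correlated latent logits -> per-device failure
# probabilities -> joint Bernoulli failure indicators.
rng = np.random.default_rng(0)

mu = np.array([-2.0, -1.5, -2.5])        # assumed latent mean logit per device
cov = np.array([[1.0, 0.6, 0.3],         # assumed correlation between nearby devices
                [0.6, 1.0, 0.5],
                [0.3, 0.5, 1.0]])

z = rng.multivariate_normal(mu, cov)     # correlated latent logits
p = 1.0 / (1.0 + np.exp(-z))             # logit-normal failure probabilities
failures = rng.binomial(1, p)            # joint Bernoulli failures

print(p, failures)
```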

2018 ◽  
Vol 3 (2) ◽  
pp. 34-48
Author(s):  
Eduardo E Oliveira ◽  
Vera L. Miguéis ◽  
Luís Guimarães ◽  
José Borges

This paper describes a study on applying data mining techniques to power transformer failure prediction. The data set used consisted not only of DGA tests but also of other tests performed on the transformer's insulating oil. This dataset presented several challenges, such as highly imbalanced classes (common in failure prediction problems) and the temporal nature of the observations. To overcome these challenges, several techniques were applied to make predictions and to better understand the dataset. Pre-processing and the incorporation of temporality into the dataset are discussed. For prediction, 1-class and 2-class SVMs, decision trees and random forests, as well as an LSTM neural network, were applied to the dataset. As the prediction performance was low (high false-positive rate), we conducted a test to ascertain whether the amount of data collected was sufficient. Results indicate that the frequency of data collection was not adequate, hinting that the degradation period was shorter than the periodicity of data collection.
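A hedged sketch of one of the techniques mentioned above, a 1-class SVM fitted only on "healthy" oil-test observations so that outliers are flagged as potential incipient failures. The feature values are synthetic placeholders, not the authors' DGA data or pipeline.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for oil-test feature vectors (e.g. gas concentrations).
rng = np.random.default_rng(1)
healthy = rng.normal(loc=0.0, scale=1.0, size=(200, 5))   # normal-operation samples
suspect = rng.normal(loc=3.0, scale=1.0, size=(5, 5))     # degraded samples

scaler = StandardScaler().fit(healthy)
model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
model.fit(scaler.transform(healthy))                       # train on healthy data only

# +1 = inlier (healthy), -1 = outlier (possible incipient failure)
print(model.predict(scaler.transform(suspect)))
```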


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Martine De Cock ◽  
Rafael Dowsley ◽  
Anderson C. A. Nascimento ◽  
Davis Railsback ◽  
Jianwei Shen ◽  
...  

Abstract Background In biomedical applications, valuable data is often split between owners who cannot openly share the data because of privacy regulations and concerns. Training machine learning models on the joint data without violating privacy is a major technological challenge that can be addressed by combining techniques from machine learning and cryptography. When collaboratively training machine learning models with the cryptographic technique named secure multi-party computation, the price paid for keeping the data of the owners private is an increase in computational cost and runtime. A careful choice of machine learning techniques, together with algorithmic and implementation optimizations, is a necessity to enable practical secure machine learning over distributed data sets. Such optimizations can be tailored to the kind of data and machine learning problem at hand. Methods Our setup involves secure two-party computation protocols, along with a trusted initializer that distributes correlated randomness to the two computing parties. We use a gradient descent based algorithm for training a logistic regression like model with a clipped ReLU activation function, and we break down the algorithm into corresponding cryptographic protocols. Our main contributions are a new protocol for computing the activation function that requires neither secure comparison protocols nor Yao's garbled circuits, and a series of cryptographic engineering optimizations to improve the performance. Results For our largest gene expression data set, we train a model that requires over 7 billion secure multiplications; the training completes in about 26.90 s in a local area network. The implementation in this work is a further optimized version of the implementation with which we won first place in Track 4 of the iDASH 2019 secure genome analysis competition. Conclusions In this paper, we present a secure logistic regression training protocol and its implementation, with a new subprotocol to securely compute the activation function. To the best of our knowledge, we present the fastest existing secure multi-party computation implementation for training logistic regression models on high dimensional genome data distributed across a local area network.
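A plaintext sketch of the "logistic regression like" model described above, using one common formulation of a clipped ReLU (max(0, min(x + 1/2, 1))) as a piecewise-linear surrogate for the sigmoid; this is what makes the activation cheap to evaluate inside secure computation. The cryptographic protocols themselves are not reproduced here, and the training loop is a generic gradient descent sketch, not the authors' implementation.

```python
import numpy as np

def clipped_relu(x):
    # Piecewise-linear surrogate for the sigmoid: min(max(x + 0.5, 0), 1).
    return np.clip(x + 0.5, 0.0, 1.0)

def train(X, y, lr=0.1, epochs=100):
    # Plain (non-secure) gradient descent for a logistic-regression-like model.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        pred = clipped_relu(X @ w)
        w -= lr * X.T @ (pred - y) / len(y)   # gradient step on squared-error-style residuals
    return w
```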


Author(s):  
J. Behmann ◽  
P. Schmitter ◽  
J. Steinrücken ◽  
L. Plümer

Detection of crop stress from hyperspectral images is of high importance for breeding and precision crop protection. However, the continuous monitoring of stress in phenotyping facilities by hyperspectral imagers produces huge amounts of uninterpreted data. In order to derive a stress description from the images, interpreting algorithms with high prediction performance are required. Based on a static model, the local stress state of each pixel has to be predicted. Due to their low computational complexity, linear models are preferable.

In this paper, we focus on drought-induced stress, which is represented by discrete stages of ordinal order. We present and compare five methods that are able to derive stress levels from hyperspectral images: one-vs.-one Support Vector Machine (SVM), one-vs.-all SVM, Support Vector Regression (SVR), Support Vector Ordinal Regression (SVORIM) and Linear Ordinal SVM classification. The methods are applied to two data sets - a real-world set of drought stress in single barley plants and a simulated data set. It is shown that Linear Ordinal SVM is a powerful tool for applications which require high prediction performance under limited resources. It is significantly more efficient than the one-vs.-one SVM and even more efficient than the less accurate one-vs.-all SVM. Compared to the very compact SVORIM model, it represents the senescence process much more accurately.
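An illustrative sketch of the ordinal idea behind the linear ordinal approaches compared above: a single linear score per pixel is compared against ordered thresholds to yield a discrete stress level. The weights and thresholds are placeholders, not values learned from the barley data set.

```python
import numpy as np

def predict_stress_level(spectra, w, thresholds):
    """spectra: (n_pixels, n_bands); w: (n_bands,); thresholds: sorted (k-1,) array."""
    scores = spectra @ w                                        # one linear projection per pixel
    # Ordinal level = number of thresholds the score exceeds (0 .. k-1).
    return np.sum(scores[:, None] > thresholds[None, :], axis=1)

w = np.array([0.2, -0.1, 0.4])                                  # assumed weights
thresholds = np.array([0.0, 0.5, 1.0])                          # 4 ordinal stress stages
pixels = np.random.default_rng(2).normal(size=(6, 3))           # synthetic spectra
print(predict_stress_level(pixels, w, thresholds))
```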


2014 ◽  
Vol 33 (2) ◽  
pp. 177-192 ◽  
Author(s):  
Shereen Hussein ◽  
Jill Manthorpe ◽  
Mohamed Ismail

Purpose – The aim of this paper is to explore the effect of ethnicity and separate this from the other dynamics associated with migration among members of the long-term care workforce in England, focusing on the nature and structure of their jobs. The analysis examines interactions between ethnicity, gender, and age, and their relations with "meso" factors related to job and organizational characteristics and "macro" level factors related to local area characteristics. Design/methodology/approach – The paper analyses new national workforce data, the National Minimum Data Set for Social Care (NMDS-SC), n=357,869. The paper employs descriptive statistical analysis and a set of logistic regression models. Findings – The results indicate that labour participation of British black and minority ethnic (BME) groups in long-term care work is much lower than previously believed. There are variations in the nature of work, and possibly in job security, by ethnicity. Research limitations/implications – While the national sample is large, the data were not purposively collected to examine differentials in reasons for working in the care sector by ethnicity. Practical implications – The analysis highlights the potential to actively promote social care work among British BME groups to meet workforce shortages, especially at a time when immigration policies are restricting the recruitment of non-European Economic Area nationals. Originality/value – The analysis provides a unique insight into the participation of British BME workers in the long-term care sector, separate from that of migrant workers.


2021 ◽  
Author(s):  
Marta Ferreira ◽  
Pierre Lovinfosse ◽  
Johanne Hermesse ◽  
Marjolein Decuypere ◽  
Caroline Rousseau ◽  
...  

Abstract Background Feature reproducibility and the generalizability of models are currently among the most important limitations when integrating radiomics into the clinic. Radiomic features are sensitive to imaging acquisition protocols, reconstruction algorithms and parameters, as well as to the different steps of the usual radiomics workflow. We propose a framework for comparing the reproducibility of different pre-processing steps in PET/CT radiomic analysis in the prediction of disease free survival (DFS) across multiple scanners/centers. Results We evaluated and compared the prediction performance of several models that differ in i) the type of intensity discretization, ii) the feature selection method, and iii) the feature type, i.e. original or tumour-to-liver ratio radiomic features (OR or TLR). We trained our models using data from one scanner/center and tested them on two external scanners/centers. Our results show that there is low reproducibility in predictions across scanners and discretization methods. Despite this, TLR-based models were generally more robust than OR-based models. Maximum relevance minimum redundancy (MRMR) forward feature selection with Pearson correlation was the feature selection method with the best mean area under the precision-recall curve when combining the features from all discretization bin numbers (D_All_FBN) with TLR features, for two of the four classifiers. Conclusion We evaluated and compared the prediction performance of several models in a data set containing 158 patients with locally advanced cervical cancer (LACC) from three distinct scanners. In our cohort of LACC patients, pre-processing of radiomic features in [18F]FDG PET affects DFS prediction performance across scanners, and combining the D_All_FBN TLR approach with the MRMR forward Pearson feature selection method might help increase the robustness of radiomic studies.
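A simplified sketch of MRMR forward selection with Pearson correlation, as named above: at each step, pick the feature with the highest relevance to the outcome minus its mean redundancy with the already selected features. This is a generic illustration, not the study's exact implementation.

```python
import numpy as np

def mrmr_forward(X, y, n_select):
    """X: (n_samples, n_features) radiomic feature matrix; y: numeric outcome vector."""
    n_features = X.shape[1]
    relevance = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)])
    selected, remaining = [], list(range(n_features))
    for _ in range(n_select):
        scores = []
        for j in remaining:
            # Mean absolute Pearson correlation with features already selected.
            redundancy = (np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected])
                          if selected else 0.0)
            scores.append(relevance[j] - redundancy)
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```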


2018 ◽  
Vol 30 (2) ◽  
pp. 238-247 ◽  
Author(s):  
Yuya Nishida ◽  
Takashi Sonoda ◽  
Shinsuke Yasukawa ◽  
Kazunori Nagano ◽  
Mamoru Minami ◽  
...  

A hovering-type autonomous underwater vehicle (AUV) capable of cruising at low altitudes and observing the seafloor using only mounted sensors and payloads was developed for sea-creature surveys. The AUV has a local area network (LAN) interface for an additional payload, through which the payload can acquire navigation data from the AUV and transmit a target value to the AUV. In the handling process of the state flow of the AUV, the additional payload can control the AUV position using the transmitted target value without checking the AUV condition. In this research, water tank tests and sea trials were performed using an AUV equipped with a visual tracking system developed in other laboratories. The experimental results proved that the additional payload can control the AUV position with a standard deviation of 0.1 m.
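A hypothetical sketch of the payload-side interaction described above: the payload sends a target value back to the AUV over the LAN interface. The address, port, and JSON message format here are assumptions for illustration, not the AUV's actual protocol.

```python
import json
import socket

# Placeholder address of the AUV on the vehicle LAN (assumption).
AUV_ADDR = ("192.168.0.10", 5000)

# Hypothetical target message: desired position and heading for the AUV.
target = {"x": 1.5, "y": -0.3, "z": 2.0, "yaw": 90.0}

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(json.dumps(target).encode(), AUV_ADDR)   # send target value to the AUV
```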


Author(s):  
Talasila Bhanuteja ◽  
Kilaru Venkata Narendra Kumar ◽  
Kolli Sai Poornachand ◽  
Chennupati Ashish ◽  
...  

The development and exploitation of several prominent data mining techniques in various real-world application areas (for example, trade, medical management and natural science) has driven the use of such methods in Machine Learning (ML) settings to extract useful pieces of information from the given data in healthcare networks, biomedical fields and so forth. The accurate analysis of medical data sets benefits early disease prediction, patient care and community services. The Machine Learning (ML) approach has been used effectively in diverse technologies, including disease prediction. The objective of building a classifier framework using Machine Learning (ML) models is to substantially help address health-related issues by assisting physicians in predicting and diagnosing diseases at an early stage. Sample data comprising 4,920 patient records diagnosed with 41 diseases was selected for analysis. A dependent variable was composed of the 41 diseases. Ninety-five of 132 independent variables (symptoms) closely related to the diseases were selected and optimized. This work presents a disease prediction system developed using machine learning algorithms such as Random Forest, Decision Tree Classifier and LightGBM. The paper presents a comparative analysis of the results obtained with the above-mentioned algorithms.
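An illustrative sketch of the kind of classifier described above: a random forest trained on binary symptom indicators to predict one of several diseases. The synthetic data below merely stands in for the 4,920-record, 95-symptom data set and does not reproduce the paper's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.integers(0, 2, size=(500, 95))   # 95 binary symptom features (synthetic)
y = rng.integers(0, 41, size=500)        # 41 disease labels (synthetic)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```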


SPE Journal ◽  
2021 ◽  
pp. 1-15
Author(s):  
Basma Alharbi ◽  
Zhenwen Liang ◽  
Jana M. Aljindan ◽  
Ammar K. Agnia ◽  
Xiangliang Zhang

Summary Trusting a machine-learning model is a critical factor that will speed the spread of the fourth industrial revolution. Trust can be achieved by understanding how a model is making decisions. For white-box models, it is easy to “see” the model and examine its prediction. For black-box models, the explanation of the decision process is not straightforward. In this work, we compare the performance of several white- and black-box models on two production data sets in an anomaly detection task. The presence of anomalies in production data can significantly influence business decisions and misrepresent the results of the analysis, if not identified. Therefore, identifying anomalies is a crucial and necessary step to maintain safety and ensure that the wells perform at full capacity. To achieve this, we compare the performance of K-nearest neighbor (KNN), logistic regression (Logit), support vector machines (SVMs), decision tree (DT), random forest (RF), and rule fit classifier (RFC). F1 and complexity are the two main metrics used to compare the prediction performance and interpretability of these models. In one data set, RFC outperformed the remaining models in both F1 and complexity, where F1 = 0.92, and complexity = 0.5. In the second data set, RF outperformed the rest in prediction performance with F1 = 0.84, yet it had the lowest complexity metric (0.04). We further analyzed the best performing models by explaining their predictions using local interpretable model-agnostic explanations, which provide justification for decisions made for each instance. Additionally, we evaluated the global rules learned from white-box models. Local and global analysis enable decision makers to understand how and why models are making certain decisions, which in turn allows trusting the models.
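A hedged sketch of the local explanation step described above, using the LIME package (assumed installed) to justify a single prediction of a random forest anomaly classifier. The feature names and synthetic production data are placeholders, not the study's data sets.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 4))                          # synthetic production features
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)          # 1 = anomaly, 0 = normal (toy rule)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=["rate", "pressure", "temperature", "choke"],   # hypothetical names
    class_names=["normal", "anomaly"],
    mode="classification",
)
explanation = explainer.explain_instance(X[0], clf.predict_proba, num_features=4)
print(explanation.as_list())   # per-feature contributions for this single prediction
```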


2018 ◽  
Vol 938 ◽  
pp. 75-80
Author(s):  
A.I. Soldatov ◽  
A.A. Soldatov ◽  
P.V. Sorokin ◽  
M.A. Kostina ◽  
Y.V. Shulgina

The article presents the imaging of the testing area for the through-transmission method with an acoustic array. Using a linear acoustic array in the through-transmission method, a set of data from different angles is obtained. Using this data set and the back projection method, an image of the test area is obtained, represented by a set of small local areas. The number of initial projections passing through each local area is calculated. Furthermore, the density function is determined, and the resulting function, encoded either in color or in grayscale, is displayed on the monitor screen.
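A simplified sketch of the back-projection step described above: for every transmitter-receiver ray, its measured value is accumulated into each small local area (grid cell) it crosses, and the number of rays through each cell is counted. The grid size and ray sampling below are illustrative assumptions, not the article's parameters.

```python
import numpy as np

def back_project(rays, values, grid_shape=(64, 64)):
    """rays: list of ((x0, y0), (x1, y1)) in unit-square coordinates; values: per-ray data."""
    counts = np.zeros(grid_shape)   # number of projections crossing each local area
    accum = np.zeros(grid_shape)    # accumulated ray values per local area
    for (p0, p1), v in zip(rays, values):
        cells = set()
        for t in np.linspace(0.0, 1.0, 200):          # sample points along the ray
            x = p0[0] + t * (p1[0] - p0[0])
            y = p0[1] + t * (p1[1] - p0[1])
            i = min(int(y * grid_shape[0]), grid_shape[0] - 1)
            j = min(int(x * grid_shape[1]), grid_shape[1] - 1)
            cells.add((i, j))
        for i, j in cells:                            # count each cell once per ray
            counts[i, j] += 1
            accum[i, j] += v
    return counts, accum            # a density image can be derived and displayed from these
```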

