Machine Learning Microservice for Identification of Accident Predecessors

Abstract Drilling accidents prediction is the important task in well construction. Drilling support software allows observing the drilling parameters for multiple wells at the same time and artificial intelligence helps detecting the drilling accident predecessor ahead the emergency situation. We present machine learning (ML) algorithm for prediction of such accidents as stuck, mud loss, fluid show, washout, break of drill string and shale collar. The model for forecasting the drilling accidents is based on the "Bag-of-features" approach, which implies the use of distributions of the directly recorded data as the main features. Bag-of-features implies the labeling of small parts of data by the particular symbol, named codeword. Building histograms of symbols for the data segment, one could use the histogram as an input for the machine learning algorithm. Fragments of real-time mud log data were used to create the model. We define more than 1000 drilling accident predecessors for more than 60 real accidents and about 2500 normal drilling cases as a training set for ML model. The developed model analyzes real-time mud log data and calculates the probability of accident. The result is presented as a probability curve for each type of accident; if the critical probability value is exceeded, the user is notified of the risk of an accident. The Bag-of-features model shows high performance by validation both on historical data and in real time. The prediction quality does not vary field to field and could be used in different fields without additional training of the ML model. The software utilizing the ML model has microservice architecture and is integrated with the WITSML data server. It is capable of real-time accidents forecasting without human intervention. As a result, the system notifies the user in all cases when the situation in the well becomes similar to the pre-accident one, and the engineer has enough time to take the necessary actions to prevent an accident.

Download Full-text

An IoT-Focused Intrusion Detection System Approach Based on Preprocessing Characterization for Cybersecurity Datasets

Sensors ◽

10.3390/s21020656 ◽

2021 ◽

Vol 21 (2) ◽

pp. 656

Author(s):

Xavier Larriva-Novo ◽

Víctor A. Villagrá ◽

Mario Vega-Barbas ◽

Diego Rivera ◽

Mario Sanz Rodrigo

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

High Performance ◽

Learning Algorithm ◽

Detection System ◽

Machine Learning Algorithms ◽

Statistical Characteristics ◽

Detection Techniques ◽

Traffic Characteristics ◽

Benchmark Datasets

Security in IoT networks is currently mandatory, due to the high amount of data that has to be handled. These systems are vulnerable to several cybersecurity attacks, which are increasing in number and sophistication. Due to this reason, new intrusion detection techniques have to be developed, being as accurate as possible for these scenarios. Intrusion detection systems based on machine learning algorithms have already shown a high performance in terms of accuracy. This research proposes the study and evaluation of several preprocessing techniques based on traffic categorization for a machine learning neural network algorithm. This research uses for its evaluation two benchmark datasets, namely UGR16 and the UNSW-NB15, and one of the most used datasets, KDD99. The preprocessing techniques were evaluated in accordance with scalar and normalization functions. All of these preprocessing models were applied through different sets of characteristics based on a categorization composed by four groups of features: basic connection features, content characteristics, statistical characteristics and finally, a group which is composed by traffic-based features and connection direction-based traffic characteristics. The objective of this research is to evaluate this categorization by using various data preprocessing techniques to obtain the most accurate model. Our proposal shows that, by applying the categorization of network traffic and several preprocessing techniques, the accuracy can be enhanced by up to 45%. The preprocessing of a specific group of characteristics allows for greater accuracy, allowing the machine learning algorithm to correctly classify these parameters related to possible attacks.

Download Full-text

Low-cost real-time non-intrusive appliance identification and controlling through machine learning algorithm

2018 International Symposium on Consumer Technologies (ISCT) ◽

10.1109/isce.2018.8408911 ◽

2018 ◽

Cited By ~ 3

Author(s):

Sheharyar Khan ◽

Ahmad Farhan Latif ◽

Sarmad Sohaib

Keyword(s):

Machine Learning ◽

Real Time ◽

Learning Algorithm ◽

Low Cost ◽

Machine Learning Algorithm

Download Full-text

A Novel Machine Learning Sepsis Prediction Algorithm for Intended ICU Use (NAVOY Sepsis): A Proof-of-Concept Study (Preprint)

10.2196/preprints.28000 ◽

2021 ◽

Author(s):

Inger Persson ◽

Andreas Östling ◽

Martin Arlbrandt ◽

Joakim Söderberg ◽

David Becedas

Keyword(s):

Machine Learning ◽

High Performance ◽

Learning Algorithm ◽

Scoring Systems ◽

High Accuracy ◽

Prediction Algorithm ◽

Massachusetts Institute Of Technology ◽

Operating Characteristics ◽

Mortality And Morbidity ◽

Institute Of Technology

BACKGROUND Despite decades of research, sepsis remains a leading cause of mortality and morbidity in ICUs worldwide. The key to effective management and patient outcome is early detection, where no prospectively validated machine learning prediction algorithm is available for clinical use in Europe today. OBJECTIVE To develop a high-performance machine learning sepsis prediction algorithm based on routinely collected ICU data, designed to be implemented in Europe. METHODS The machine learning algorithm is developed using Convolutional Neural Network, based on the Massachusetts Institute of Technology Lab for Computational Physiology MIMIC-III Clinical Database, focusing on ICU patients aged 18 years or older. Twenty variables are used for prediction, on an hourly basis. Onset of sepsis is defined in accordance with the international Sepsis-3 criteria. RESULTS The developed algorithm NAVOY Sepsis uses 4 hours of input and can with high accuracy predict patients with high risk of developing sepsis in the coming hours. The prediction performance is superior to that of existing sepsis early warning scoring systems, and competes well with previously published prediction algorithms designed to predict sepsis onset in accordance with the Sepsis-3 criteria, as measured by the area under the receiver operating characteristics curve (AUROC) and the area under the precision-recall curve (AUPRC). NAVOY Sepsis yields AUROC = 0.90 and AUPRC = 0.62 for predictions up to 3 hours before sepsis onset. The predictive performance is externally validated on hold-out test data, where NAVOY Sepsis is confirmed to predict sepsis with high accuracy. CONCLUSIONS An algorithm with excellent predictive properties has been developed, based on variables routinely collected at ICUs. This algorithm is to be further validated in an ongoing prospective randomized clinical trial and will be CE marked as Software as a Medical Device, designed for commercial use in European ICUs.

Download Full-text

Using AI at the Edge and Incremental Machine Learning to Process Onboard Instrument Data

10.5957/smc-2021-048 ◽

2021 ◽

Author(s):

Nicholas Parkyn

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Real Time ◽

Data Storage ◽

High Performance ◽

Heterogeneous Computing ◽

Low Cost ◽

Machine Intelligence ◽

Edge Computing ◽

Continual Learning

Emerging heterogeneous computing, computing at the edge, machine learning and AI at the edge technology drives approaches and techniques for processing and analysing onboard instrument data in near real-time. The author has used edge computing and neural networks combined with high performance heterogeneous computing platforms to accelerate AI workloads. Heterogeneous computing hardware used is readily available, low cost, delivers impressive AI performance and can run multiple neural networks in parallel. Collecting, processing and machine learning from onboard instruments data in near real-time is not a trivial problem due to data volumes, complexities of data filtering, data storage and continual learning. Little research has been done on continual machine learning which aims at a higher level of machine intelligence through providing the artificial agents with the ability to learn from a non-stationary and never-ending stream of data. The author has applied the concept of continual learning to building a system that continually learns from actual boat performance and refines predictions previously done using static VPP data. The neural networks used are initially trained using the output from traditional VPP software and continue to learn from actual data collected under real sailing conditions. The author will present the system design, AI, and edge computing techniques used and the approaches he has researched for incremental training to realise continual learning.

Download Full-text

Using Supervised Machine Learning Algorithms for Automated Lithology Prediction from Wireline Log Data

10.2118/208559-ms ◽

2021 ◽

Author(s):

Marian Popescu ◽

Rebecca Head ◽

Tim Ferriday ◽

Kate Evans ◽

Jose Montero ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Training Dataset ◽

Depth Interval ◽

Log Data ◽

Machine Learning Approach ◽

Lithology Prediction ◽

Logging While Drilling

Abstract This paper presents advancements in machine learning and cloud deployment that enable rapid and accurate automated lithology interpretation. A supervised machine learning technique is described that enables rapid, consistent, and accurate lithology prediction alongside quantitative uncertainty from large wireline or logging-while-drilling (LWD) datasets. To leverage supervised machine learning, a team of geoscientists and petrophysicists made detailed lithology interpretations of wells to generate a comprehensive training dataset. Lithology interpretations were based on applying determinist cross-plotting by utilizing and combining various raw logs. This training dataset was used to develop a model and test a machine learning pipeline. The pipeline was applied to a dataset previously unseen by the algorithm, to predict lithology. A quality checking process was performed by a petrophysicist to validate new predictions delivered by the pipeline against human interpretations. Confidence in the interpretations was assessed in two ways. The prior probability was calculated, a measure of confidence in the input data being recognized by the model. Posterior probability was calculated, which quantifies the likelihood that a specified depth interval comprises a given lithology. The supervised machine learning algorithm ensured that the wells were interpreted consistently by removing interpreter biases and inconsistencies. The scalability of cloud computing enabled a large log dataset to be interpreted rapidly; >100 wells were interpreted consistently in five minutes, yielding >70% lithological match to the human petrophysical interpretation. Supervised machine learning methods have strong potential for classifying lithology from log data because: 1) they can automatically define complex, non-parametric, multi-variate relationships across several input logs; and 2) they allow classifications to be quantified confidently. Furthermore, this approach captured the knowledge and nuances of an interpreter's decisions by training the algorithm using human-interpreted labels. In the hydrocarbon industry, the quantity of generated data is predicted to increase by >300% between 2018 and 2023 (IDC, Worldwide Global DataSphere Forecast, 2019–2023). Additionally, the industry holds vast legacy data. This supervised machine learning approach can unlock the potential of some of these datasets by providing consistent lithology interpretations rapidly, allowing resources to be used more effectively.

Download Full-text

Machine Learning to Design an Auto-tuning System for the Best Compressed Format Detection for Parallel Sparse Computations

Parallel Processing Letters ◽

10.1142/s0129626421500195 ◽

2021 ◽

Author(s):

Olfa Hamdi-Larbi ◽

Ichrak Mehrez ◽

Thomas Dufaud

Keyword(s):

Machine Learning ◽

Numerical Method ◽

High Performance ◽

Programming Model ◽

Learning Algorithm ◽

Sparse Matrix ◽

Sparse Matrices ◽

Matrix Compression ◽

Target Architecture ◽

Parallel Programming Model

Many applications in scientific computing process very large sparse matrices on parallel architectures. The presented work in this paper is a part of a project where our general aim is to develop an auto-tuner system for the selection of the best matrix compression format in the context of high-performance computing. The target smart system can automatically select the best compression format for a given sparse matrix, a numerical method processing this matrix, a parallel programming model and a target architecture. Hence, this paper describes the design and implementation of the proposed concept. We consider a case study consisting of a numerical method reduced to the sparse matrix vector product (SpMV), some compression formats, the data parallel as a programming model and, a distributed multi-core platform as a target architecture. This study allows extracting a set of important novel metrics and parameters which are relative to the considered programming model. Our metrics are used as input to a machine-learning algorithm to predict the best matrix compression format. An experimental study targeting a distributed multi-core platform and processing random and real-world matrices shows that our system can improve in average up to 7% the accuracy of the machine learning.

Download Full-text

A high performance machine learning algorithm TspINA; scheduling multifariousness destined tasks by better efficiency

2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4) ◽

10.1109/worlds450073.2020.9210370 ◽

2020 ◽

Author(s):

Shantanu A. Lohi ◽

Nandita Tiwari

Keyword(s):

Machine Learning ◽

High Performance ◽

Learning Algorithm ◽

Machine Learning Algorithm

Download Full-text

LabelSens: enabling real-time sensor data labelling at the point of collection using an artificial intelligence-based approach

Personal and Ubiquitous Computing ◽

10.1007/s00779-020-01427-x ◽

2020 ◽

Vol 24 (5) ◽

pp. 709-722

Author(s):

Kieran Woodward ◽

Eiman Kanjo ◽

Andreas Oikonomou ◽

Alan Chamberlain

Keyword(s):

Machine Learning ◽

Data Collection ◽

Real Time ◽

High Performance ◽

Poor Performance ◽

Limited Range ◽

Performance Comparison ◽

Mobile App ◽

Sensor Data ◽

New Techniques

Abstract In recent years, machine learning has developed rapidly, enabling the development of applications with high levels of recognition accuracy relating to the use of speech and images. However, other types of data to which these models can be applied have not yet been explored as thoroughly. Labelling is an indispensable stage of data pre-processing that can be particularly challenging, especially when applied to single or multi-model real-time sensor data collection approaches. Currently, real-time sensor data labelling is an unwieldy process, with a limited range of tools available and poor performance characteristics, which can lead to the performance of the machine learning models being compromised. In this paper, we introduce new techniques for labelling at the point of collection coupled with a pilot study and a systematic performance comparison of two popular types of deep neural networks running on five custom built devices and a comparative mobile app (68.5–89% accuracy within-device GRU model, 92.8% highest LSTM model accuracy). These devices are designed to enable real-time labelling with various buttons, slide potentiometer and force sensors. This exploratory work illustrates several key features that inform the design of data collection tools that can help researchers select and apply appropriate labelling techniques to their work. We also identify common bottlenecks in each architecture and provide field tested guidelines to assist in building adaptive, high-performance edge solutions.

Download Full-text

Machine learning algorithms can predict tail biting outbreaks in pigs using feeding behaviour records

10.1101/2021.05.11.443554 ◽

2021 ◽

Author(s):

Catherine Ollagnier ◽

Claudia Kasper ◽

Anna Wallenbeck ◽

Linda Keeling ◽

Siavash A Bigdeli

Keyword(s):

Machine Learning ◽

Random Forest ◽

Real Time ◽

Feeding Behaviour ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Algorithm ◽

Tail Biting ◽

Testing Set

Tail biting is a detrimental behaviour that impacts the welfare and health of pigs. Early detection of tail biting precursor signs allows for preventive measures to be taken, thus avoiding the occurrence of the tail biting event. This study aimed to build a machine-learning algorithm for real time detection of upcoming tail biting outbreaks, using feeding behaviour data recorded by an electronic feeder. Prediction capacities of seven machine learning algorithms (e.g., random forest, neural networks) were evaluated from daily feeding data collected from 65 pens originating from 2 herds of grower-finisher pigs (25-100kg), in which 27 tail biting events occurred. Data were divided into training and testing data, either by randomly splitting data into 75% (training set) and 25% (testing set), or by randomly selecting pens to constitute the testing set. The random forest algorithm was able to predict 70% of the upcoming events with an accuracy of 94%, when predicting events in pens for which it had previous data. The detection of events for unknown pens was less sensitive, and the neural network model was able to detect 14% of the upcoming events with an accuracy of 63%. A machine-learning algorithm based on ongoing data collection should be considered for implementation into automatic feeder systems for real time prediction of tail biting events.

Download Full-text

A Machine Learning Approach to Predict Hypotensive Events in ICU Settings

10.1101/794768 ◽

2019 ◽

Author(s):

Mina Chookhachizadeh Moghadam ◽

Ehsan Masoumi ◽

Nader Bagherzadeh ◽

Davinder Ramsingh ◽

Guann-Pyng Li ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

Evaluation Method ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Physiological Status ◽

Time Prediction ◽

Evaluation Approach ◽

High Positive Predictive Value ◽

Data Points

AbstractPurposePredicting hypotension well in advance provides physicians with enough time to respond with proper therapeutic measures. However, the real-time prediction of hypotension with high positive predictive value (PPV) is a challenge due to the dynamic changes in patients’ physiological status under the drug administration which is limiting the amount of useful data available for the algorithm.MethodsTo mimic real-time monitoring, we developed a machine learning algorithm that uses most of the available data points from patients’ record to train and test the algorithm. The algorithm predicts hypotension up to 30 minutes in advance based on only 5 minutes of patient’s physiological history. A novel evaluation method is proposed to assess the algorithm performance as a function of time at every timestamp within 30 minutes prior to hypotension. This evaluation approach provides statistical tools to find the best possible prediction window.ResultsDuring 181,000 minutes of monitoring of about 400 patients, the algorithm demonstrated 94% accuracy, 85% sensitivity and 96% specificity in predicting hypotension within 30 minutes of the events. A high PPV of 81% obtained and the algorithm predicted 80% of the events 25 minutes prior to their onsets. It was shown that choosing a classification threshold that maximizes the F1 score during the training phase contributes to a high PPV and sensitivity.ConclusionThis study reveals the promising potential of the machine learning algorithms in real-time prediction of hypotensive events in ICU setting based on short-term physiological history.

Download Full-text