Detecting Pressure Anomalies While Drilling Using a Machine Learning Hybrid Approach

2021
Author(s):
Aurore Lafond
Maurice Ringer
Florian Le Blay
Jiaxu Liu
Ekaterina Millan
...  

Abstract
Abnormal surface pressure is typically the first indicator of a number of problematic events, including kicks, losses, washouts and stuck pipe. These events account for 60–70% of all drilling-related nonproductive time, so their early and accurate detection has the potential to save the industry billions of dollars. Detecting these events today requires an expert user watching multiple curves, which is costly and subject to human error. The solution presented in this paper aims to augment traditional models with new machine learning techniques that detect these events automatically and assist in monitoring the well. Today's real-time monitoring systems employ complex physical models to estimate surface standpipe pressure while drilling; these require many inputs and are difficult to calibrate. Machine learning is an alternative method for predicting pump pressure, but on its own it needs significant labelled training data, which is often lacking in the drilling world. The new system combines the two approaches: a machine learning framework enables automated learning, while the physical models compensate for gaps in the training data. The system uses only standard surface measurements, is fully automated, and is continuously retrained while drilling to ensure the most accurate pressure prediction. In addition, a stochastic (Bayesian) machine learning technique is used, which yields not only a prediction of the pressure but also the uncertainty and confidence of that prediction. Finally, the new system includes a data quality control workflow that discards periods of low data quality from the pressure anomaly detection and enables smarter real-time event analysis. The new system has been tested on historical wells using a new test and validation framework, which runs the system automatically on large volumes of both historical and simulated data so that results can be cross-referenced with observations. In this paper, we show the results of the automated test framework as well as the capabilities of the new system in two case studies, one on land and one offshore. Moreover, large-scale statistics demonstrate the reliability and efficiency of the new detection workflow. The new system builds on the trend in our industry to better capture and utilize digital data for optimizing drilling.
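
To make the combination concrete, here is a minimal sketch of how a stochastic learner layered on a physics estimate could flag pressure anomalies. It assumes a Gaussian process as the Bayesian model and residual learning as the coupling; the function and variable names (fit_pressure_model, detect_anomalies, k_sigma) are illustrative, not the paper's implementation.

```python
# Hedged sketch: the physics model supplies a baseline standpipe-pressure
# estimate, and a Gaussian process models the residual, so gaps in the
# training data fall back toward the physics prediction. Illustrative only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_pressure_model(train_features, train_measured, train_physics):
    """Fit a GP on the residual between measured and physics-model pressure
    over a calibration window of normal (non-anomalous) drilling data."""
    residual = train_measured - train_physics
    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    return gp.fit(train_features, residual)

def detect_anomalies(gp, features, measured, physics, k_sigma=3.0):
    """Flag samples whose measured pressure leaves the confidence band."""
    # Stochastic prediction: mean correction plus per-sample uncertainty.
    corr_mean, corr_std = gp.predict(features, return_std=True)
    predicted = physics + corr_mean
    # Anomaly when the measurement falls outside the k-sigma band.
    return np.abs(measured - predicted) > k_sigma * corr_std
```

In practice the model would be refit on a rolling window of recent data, mirroring the continuous retraining the paper describes.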

2021
Vol 11 (2)
pp. 472
Author(s):
Hyeongmin Cho
Sangkyun Lee

Machine learning has been proven effective in various application areas, such as object and speech recognition on mobile systems. Since the availability of large training datasets is critical to machine learning success, many datasets are being disclosed and published online. From a data consumer's or manager's point of view, measuring data quality is an important first step in the learning process: we need to determine which datasets to use, update, and maintain. However, few practical ways to measure data quality are available today, especially for large-scale high-dimensional data such as images and videos. This paper proposes two data quality measures that can compute class separability and in-class variability, two important aspects of data quality, for a given dataset. Classical data quality measures tend to focus only on class separability; we suggest that in-class variability is another important data quality factor. We provide efficient algorithms to compute our quality measures based on random projections and bootstrapping, with statistical benefits on large-scale high-dimensional data. In experiments, we show that our measures are compatible with classical measures on small-scale data and can be computed much more efficiently on large-scale high-dimensional datasets.
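
As a rough illustration of the two measures, the sketch below estimates class separability and in-class variability on a random low-dimensional projection, averaged over bootstrap resamples. It is a simplified stand-in under those assumptions, not the authors' estimators; quality_measures and its defaults are invented for the example.

```python
# Hedged sketch: project high-dimensional data into a small random subspace,
# then use bootstrap resamples to estimate class separability (distance
# between class centroids) and in-class variability (spread around them).
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

def quality_measures(X, y, n_components=32, n_boot=20, seed=0):
    rng = np.random.default_rng(seed)
    Z = GaussianRandomProjection(n_components=n_components,
                                 random_state=seed).fit_transform(X)
    sep, var = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, len(Z), size=len(Z))  # bootstrap resample
        Zb, yb = Z[idx], y[idx]
        centroids = {c: Zb[yb == c].mean(axis=0) for c in np.unique(yb)}
        # In-class variability: mean distance of samples to their centroid.
        var.append(np.mean([np.linalg.norm(Zb[i] - centroids[yb[i]])
                            for i in range(len(Zb))]))
        # Separability: mean pairwise distance between class centroids.
        cs = list(centroids.values())
        sep.append(np.mean([np.linalg.norm(a - b)
                            for i, a in enumerate(cs) for b in cs[i + 1:]]))
    return np.mean(sep), np.mean(var)
```

The random projection keeps the per-resample cost low even for image-sized feature vectors, which is the statistical benefit the paragraph refers to.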


2016
Vol 42 (6)
pp. 782-797
Author(s):
Haifa K. Aldayel
Aqil M. Azmi

The fact that people freely express their opinions and ideas in no more than 140 characters makes Twitter one of the most prevalent social networking websites in the world. Because Twitter is popular in Saudi Arabia, we believe tweets are a good source for capturing public sentiment, especially since the country lies in a fractious region. After reviewing the challenges and difficulties that Arabic tweets present, using Saudi Arabia as a basis, we propose our solution. A typical problem is the practice of tweeting in dialectal Arabic. Based on our observations, we recommend a hybrid approach that combines semantic orientation and machine learning techniques: a lexicon-based classifier labels the training data, a time-consuming task usually done manually, and its output is then used as training data for an SVM machine learning classifier. The experiments show that our hybrid approach improved the F-measure of the lexical classifier by 5.76% while the accuracy jumped by 16.41%, achieving an overall F-measure and accuracy of 84% and 84.01%, respectively.
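
A minimal sketch of this hybrid pipeline, assuming a toy polarity lexicon: the lexicon-based step labels the raw tweets, and those labels train an SVM text classifier. The lexicon entries and helper names are illustrative; the authors' Saudi-dialect lexicon and features are more elaborate.

```python
# Hedged sketch of the lexicon-labels-then-SVM hybrid. Toy lexicon only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

LEXICON = {"جميل": 1, "رائع": 1, "سيئ": -1}  # toy entries: +1 positive, -1 negative

def lexical_label(tweet):
    """Semantic-orientation step: sum word polarities to label a tweet."""
    score = sum(LEXICON.get(w, 0) for w in tweet.split())
    return "pos" if score >= 0 else "neg"

def train_hybrid(unlabeled_tweets):
    # The lexicon labels the corpus, replacing manual annotation...
    labels = [lexical_label(t) for t in unlabeled_tweets]
    # ...and those noisy labels train the SVM classifier.
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(unlabeled_tweets, labels)
    return model
```

The SVM can then generalize beyond the lexicon's vocabulary, which is where the reported F-measure and accuracy gains over the lexical classifier alone would come from.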


2020
Vol 12 (14)
pp. 2244
Author(s):
Luis Moya
Erick Mas
Shunichi Koshimura

Applications of machine learning to remote sensing data appear to be endless. Its use in damage identification for early response in the aftermath of a large-scale disaster, however, faces a specific issue: collecting training data right after a disaster is costly, time-consuming, and often impossible. This study analyzes a possible solution, namely collecting training data from past disaster events to calibrate a discriminant function, so that affected areas in a current disaster can be identified in near real-time. We report the performance of a supervised machine learning classifier trained on data from the 2018 heavy rainfall in Okayama Prefecture, Japan, and used to identify floods caused by Typhoon Hagibis in eastern Japan on 12 October 2019. The results show moderate agreement with flood maps provided by local governments and public institutions, and support the assumption that information from previous disasters can be used to identify a current disaster in near real-time.
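
The transfer setup can be summarized in a few lines: calibrate a discriminant function on labeled data from the past event and apply it unchanged to the new one. The sketch below uses linear discriminant analysis as one possible discriminant; the feature contents (e.g., per-pixel change features from satellite imagery) are assumptions for illustration.

```python
# Hedged sketch of the cross-event transfer: fit on a past disaster,
# predict on the current one with no new labels.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def calibrate_on_past_event(X_2018, y_2018):
    """X: per-pixel features from the 2018 Okayama rainfall; y: flooded or not."""
    clf = LinearDiscriminantAnalysis()
    clf.fit(X_2018, y_2018)
    return clf

def map_current_event(clf, X_2019):
    """Near-real-time inference on the 2019 Typhoon Hagibis scene:
    only features from the current imagery are needed, no new labels."""
    return clf.predict(X_2019)
```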


2020
Vol 65 (3)
pp. 1-12
Author(s):
Ryan D. Jackson
Michael Jump
Peter L. Green

Physical-law-based models are widely utilized in the aerospace industry. One such use is to provide flight dynamics models for flight simulators. For human-in-the-loop use, such simulators must run in real-time. Owing to the complex physics of rotorcraft flight, simplifications to the underlying physics sometimes have to be applied to the model to meet this real-time requirement, leading to errors in the model's predictions of the real vehicle's response. This study investigated whether a machine-learning technique could be employed to provide rotorcraft dynamic response predictions. Machine learning was facilitated using a Gaussian process (GP) nonlinear autoregressive model, which predicted the on-axis pitch rate, roll rate, yaw rate, and heave responses of a Bo105 rotorcraft. A variational sparse GP model was then developed to reduce the computational cost of implementing the approach on large datasets. Both GP models were able to provide accurate on-axis response predictions, particularly when the model input contained all four control inceptors and one lagged on-axis response term, and their predictions improved on those of a corresponding physics-based model. Reducing the training data to one-third (rotational axes) or one-half (heave axis) resulted in only minor degradation of the sparse GP model predictions.
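
A minimal sketch of a GP-NARX one-step predictor with the input layout described above (four control inceptors plus one lagged on-axis response). It uses a standard exact GP on synthetic stand-in data, not the study's Bo105 data or its variational sparse formulation.

```python
# Hedged sketch of a GP nonlinear autoregressive (NARX) one-step predictor.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def build_narx_dataset(inceptors, response, lag=1):
    """inceptors: (T, 4) control inputs; response: (T,) on-axis rate.
    Each row pairs the current controls with the lagged response term."""
    X = np.hstack([inceptors[lag:], response[:-lag, None]])
    y = response[lag:]
    return X, y

# Synthetic stand-in data; real use would take recorded flight-test signals.
X, y = build_narx_dataset(np.random.randn(500, 4), np.random.randn(500))
gp = GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(5)),
                              normalize_y=True).fit(X, y)
pitch_rate_pred = gp.predict(X[:10])  # one-step-ahead on-axis prediction
```

The anisotropic kernel (one length scale per input) lets the GP weight the four inceptors and the lagged response term differently, which is one natural design choice for this input layout.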


2021
Author(s):
Prasanna Amur Varadarajan
Ghislain Roguin
Nick Abolins
Maurice Ringer

Abstract
Abnormal hydraulic event detection is essential during offshore well construction and relies on comparing model predictions with real-time measurements. Physics-based models used for this task require frequent manual calibration and still do not accurately capture all hydraulic trends. This paper presents a method that overcomes these limitations by combining physics-based models with machine learning techniques suited to time-series forecasting. The method provides accurate and reliable predictions over the forecasting period and removes the need for frequent manual calibration of the hydraulic input parameters.
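
One plausible shape for such a combination, shown below as a sketch: a small autoregressive learner forecasts the residual between the physics-based hydraulic model and recent measurements, standing in for manual recalibration over the forecast horizon. The ridge-regression choice and all names are illustrative, not the paper's method.

```python
# Hedged sketch: forecast the physics-model residual with a small
# autoregressive learner, then add it back to the physics forecast.
import numpy as np
from sklearn.linear_model import Ridge

def fit_residual_forecaster(physics_pred, measured, n_lags=10):
    """Learn residual dynamics from the recent history window."""
    r = measured - physics_pred
    X = np.stack([r[i:i + n_lags] for i in range(len(r) - n_lags)])
    y = r[n_lags:]
    return Ridge(alpha=1.0).fit(X, y), r[-n_lags:]

def forecast(model, last_lags, physics_future):
    """Roll the residual model forward and add it to the physics forecast."""
    lags, out = list(last_lags), []
    for p in physics_future:
        r_next = model.predict(np.array(lags[-len(last_lags):])[None])[0]
        lags.append(r_next)
        out.append(p + r_next)
    return np.array(out)
```

Because the learner tracks the drift that manual calibration would otherwise absorb, the physics model's input parameters can be left untouched between infrequent recalibrations.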


Author(s):  
Ritu Khandelwal
Hemlata Goyal
Rajveer Singh Shekhawat

Introduction: Machine learning is an intelligent technology that acts as a bridge between business and data science. With data science involved, the business goal is to extract valuable insights from the available data. A large part of Indian cinema is Bollywood, a multi-million-dollar industry. This paper attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average, or Flop by applying machine learning techniques for classification and prediction. The first step in building a classifier or prediction model is the learning stage, in which a training dataset is used to train the model with some technique or algorithm; the rules generated at this stage form the model and allow future trends to be predicted for different types of organizations.
Methods: Classification and prediction techniques, namely Support Vector Machine (SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, AdaBoost, and KNN, are applied to find the most efficient and effective model. All of these steps can be performed through GUI-based workflows organized into categories such as Data, Visualize, Model, and Evaluate.
Result: The trained models generate classification rules that are used to predict the success category of upcoming movies.
Conclusion: This paper presents a comparative analysis based on parameters such as accuracy and the confusion matrix to identify the best model for predicting movie success. Production houses can use the predicted success rate to plan advertisement propaganda and choose the best time to release a movie to gain higher benefits.
Discussion: Data mining is the process of discovering patterns and relationships in large datasets to solve business problems and predict forthcoming trends. Such predictions can help production houses plan their advertising and costs and thereby make a movie more profitable.
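
A compact sketch of the comparative workflow described above: train each listed classifier on the same split and report accuracy and the confusion matrix. The feature set (budget, cast popularity, and so on) is assumed, and this plain-script version stands in for the GUI-based workflow the paper uses.

```python
# Hedged sketch: compare the paper's candidate classifiers on one split.
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

MODELS = {
    "SVM": SVC(), "RandomForest": RandomForestClassifier(),
    "DecisionTree": DecisionTreeClassifier(), "NaiveBayes": GaussianNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(), "KNN": KNeighborsClassifier(),
}

def compare_models(X, y):  # y: Blockbuster/Superhit/Hit/Average/Flop labels
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    for name, model in MODELS.items():
        pred = model.fit(X_tr, y_tr).predict(X_te)
        print(name, accuracy_score(y_te, pred))
        print(confusion_matrix(y_te, pred))
```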


2020
Vol 8 (Suppl 3)
pp. A62-A62
Author(s):
Dattatreya Mellacheruvu
Rachel Pyke
Charles Abbott
Nick Phillips
Sejal Desai
...  

Background: Accurately identified neoantigens can be effective therapeutic agents in both adjuvant and neoadjuvant settings. A key challenge for neoantigen discovery has been the availability of accurate prediction models for MHC peptide presentation. We have shown previously that our proprietary model based on (i) large-scale, in-house mono-allelic data, (ii) custom features that model antigen processing, and (iii) advanced machine learning algorithms has strong performance. We have extended this work by systematically integrating large quantities of high-quality, publicly available data, implementing new modelling algorithms, and rigorously testing our models. These extensions lead to substantial improvements in performance and generalizability. Our algorithm, named Systematic HLA Epitope Ranking Pan Algorithm (SHERPA™), is integrated into the ImmunoID NeXT Platform®, our immuno-genomics and transcriptomics platform specifically designed to enable the development of immunotherapies.
Methods: In-house immunopeptidomic data was generated using stably transfected HLA-null K562 cell lines that express a single HLA allele of interest, followed by immunoprecipitation using the W6/32 antibody and LC-MS/MS. Public immunopeptidomics data was downloaded from repositories such as MassIVE and processed uniformly using in-house pipelines to generate peptide lists filtered at a 1% false discovery rate. Other metrics (features) were either extracted from the source data or generated internally by re-processing samples on the ImmunoID NeXT Platform.
Results: We generated large-scale, high-quality immunopeptidomics data from approximately 60 mono-allelic cell lines, which unambiguously assign peptides to their presenting alleles, to create our primary models. Briefly, our primary 'binding' algorithm models MHC-peptide binding using the peptide and binding pockets, while our primary 'presentation' model uses additional features to model antigen processing and presentation. Both primary models have significantly higher precision across all recall values in multiple test data sets, including mono-allelic cell lines and multi-allelic tissue samples. To further improve performance, we expanded the diversity of our training set with high-quality, publicly available mono-allelic immunopeptidomics data. Furthermore, multi-allelic data was integrated by resolving peptide-to-allele mappings with our primary models. We then trained a new model on the expanded training data using a new composite machine learning architecture. The resulting secondary model further improves performance and generalizability across several tissue samples.
Conclusions: Improving technologies for neoantigen discovery is critical for many therapeutic applications, including personalized neoantigen vaccines and neoantigen-based biomarkers for immunotherapies. Our new and improved algorithm (SHERPA) has significantly higher performance than a state-of-the-art public algorithm and furthers this objective.
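
The headline comparison ("higher precision across all recall values") is a precision-recall evaluation; a minimal sketch of that computation is shown below. The inputs (per-peptide labels and model scores) are assumed, and this is generic evaluation code, not part of SHERPA.

```python
# Hedged sketch: summarize a presentation model's PR behavior on held-out
# immunopeptidomics data, for side-by-side comparison with a baseline.
from sklearn.metrics import average_precision_score, precision_recall_curve

def pr_curve_and_ap(y_true, scores):
    """y_true: 1 = observed presented peptide, 0 = decoy/non-presented.
    scores: the model's predicted presentation likelihood per peptide."""
    precision, recall, _ = precision_recall_curve(y_true, scores)
    return precision, recall, average_precision_score(y_true, scores)

# Usage: compare pr_curve_and_ap(labels, model_scores) against
# pr_curve_and_ap(labels, baseline_scores) on the same test set.
```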


2021
Author(s):
Temirlan Zhekenov
Artem Nechaev
Kamilla Chettykbayeva
Alexey Zinovyev
German Sardarov
...  

SUMMARY
Researchers base their analyses on basic drilling parameters obtained during mud logging and have demonstrated impressive results. However, because of the data-quality limitations often present during drilling, those solutions tend to lose stability and predictive power. In this work, we introduce the concept of hybrid modeling, which integrates analytical correlations with machine learning algorithms to obtain stable solutions that remain consistent from one dataset to another.
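
One simple way to realize such a hybrid, sketched below: feed the output of an analytical correlation to the learner as an extra feature, so the machine learning model only needs to learn the correction the correlation misses. The placeholder correlation and the gradient-boosting choice are illustrative assumptions, not the authors' equations.

```python
# Hedged sketch of the hybrid-modeling concept: an analytical correlation's
# output becomes an input feature of the machine learning model.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def analytical_correlation(mud_logging_params):
    """Placeholder for a published drilling correlation (illustrative)."""
    return mud_logging_params @ np.ones(mud_logging_params.shape[1])

def fit_hybrid(mud_logging_params, target):
    baseline = analytical_correlation(mud_logging_params)
    X = np.column_stack([mud_logging_params, baseline])  # physics as a feature
    return GradientBoostingRegressor().fit(X, target)
```

Anchoring the learner to the correlation in this way is one route to the cross-dataset stability the summary emphasizes.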

