Enhancing Well, Reservoir and Facilities Management (WRFM) Opportunity Identification with Data-Driven Techniques

2021 ◽  
Author(s):  
Manu Ujjwal ◽  
Gaurav Modi ◽  
Srungeer Simha

Abstract A key to successful Well, Reservoir and Facilities Management (WRFM) is to have an up-to-date opportunity funnel. In large mature fields, WRFM opportunity identification is heavily dependent on effective exploitation of measured and interpreted data. This paper presents a suite of data-driven workflows, collectively called the WRFM Opportunity Finder (WOF), that generates a ranked list of opportunities across the WRFM opportunity spectrum. The WOF was developed for a mature waterflooded asset with over 500 active wells and over 30 years of production history. The first step comprised data collection and clean-up using Python routines and integration of the data into an interactive visualization dashboard. The WOF used this data to generate ranked lists of the following opportunity types: (a) bean-up/bean-down candidates, (b) water shut-off candidates, (c) add-perf candidates, (d) PLT/ILT data-gathering candidates, and (e) well stimulation candidates. The WOF algorithms, implemented in Python, largely comprised rule-based workflows with occasional use of machine learning in intermediate steps. In a large mature asset, field/reservoir/well reviews are typically conducted area by area or reservoir by reservoir and are therefore slow. It is challenging to maintain an up-to-date, holistic overview of opportunities across the field that allows the best opportunities to be prioritized. Although the opportunity screening logic may be linked to clear physics-based rules, maturing opportunities is often difficult because it requires processing and integration of large volumes of multi-disciplinary data through laborious manual review. The WOF addressed these issues with data processing algorithms that gathered data directly from databases and applied customized processing routines, reducing data preparation and integration time by 90%. The WOF used workflows grounded in petroleum engineering principles to arrive at ranked lists of opportunities with the potential to add a 1-2% increment in oil production. The integrated visualization dashboard allowed quick and transparent validation of the identified opportunities and their ranking basis using a variety of independent checks. The results from the WOF will inform a range of business delivery elements such as the workover and data-gathering plan, exception-based surveillance, and the facilities debottlenecking plan. The WOF exploits the best of both worlds - physics-based solutions and data-driven techniques. It offers transparent logic that is scalable and replicable across a variety of settings and hence has an edge over pure machine learning approaches. The WOF accelerates identification of low-capex/no-capex opportunities using existing data. It promotes maximization of returns on investments already made and hence lends resilience to the business in a low oil price environment.
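
To make the rule-based flavour of such workflows concrete, the sketch below shows a minimal Python screening step in the spirit of the bean-up ranking described above. The column names, thresholds, and scoring rule are illustrative assumptions, not the authors' actual WOF rules.

```python
import pandas as pd

def rank_bean_up_candidates(wells: pd.DataFrame,
                            max_watercut: float = 0.8,
                            min_headroom_bar: float = 5.0) -> pd.DataFrame:
    """Rank wells that could be beaned up (choke opened further).

    Expects columns: 'well', 'watercut', 'pwf', 'pwf_min', 'liquid_rate'.
    """
    candidates = wells[
        (wells["watercut"] < max_watercut)                      # oil-dominated wells
        & (wells["pwf"] - wells["pwf_min"] > min_headroom_bar)   # drawdown headroom left
    ].copy()
    # Simple score: expected incremental oil scales with the remaining
    # drawdown headroom times the oil fraction of the produced liquid.
    candidates["score"] = (
        (candidates["pwf"] - candidates["pwf_min"])
        * candidates["liquid_rate"] * (1.0 - candidates["watercut"])
    )
    return candidates.sort_values("score", ascending=False)

# Example usage with toy data
wells = pd.DataFrame({
    "well": ["W-1", "W-2", "W-3"],
    "watercut": [0.35, 0.92, 0.60],
    "pwf": [120.0, 95.0, 110.0],
    "pwf_min": [100.0, 90.0, 108.0],
    "liquid_rate": [800.0, 1500.0, 600.0],
})
print(rank_bean_up_candidates(wells))
```

In a full workflow, each opportunity type would have its own rule set of this form, fed by the cleaned data from the dashboard, so that the ranking basis stays transparent and auditable.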

Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1055
Author(s):  
Qian Sun ◽  
William Ampomah ◽  
Junyu You ◽  
Martha Cather ◽  
Robert Balch

Machine-learning technologies have exhibited robust competence in solving many petroleum engineering problems. Their predictive accuracy and fast computational speed make time-consuming engineering processes such as history matching and field development optimization tractable at scale. The Southwest Regional Partnership on Carbon Sequestration (SWP) project requires rigorous history-matching and multi-objective optimization processes, which play to the strengths of machine-learning approaches. Although machine-learning proxy models are trained and validated before being applied to practical problems, their error margins inevitably introduce uncertainty into the results. In this paper, a hybrid numerical machine-learning workflow for solving various optimization problems is presented. By coupling expert machine-learning proxies with a global optimizer, the workflow solves the history-matching and CO2 water-alternating-gas (WAG) design problems with low computational overhead. The history-matching work accounts for the heterogeneity of multiphase relative permeability characteristics, and the CO2-WAG injection design takes multiple techno-economic objective functions into account. This work trained an expert response surface, a support vector machine, and a multi-layer neural network as proxy models to learn the high-dimensional nonlinear data structure effectively. The proposed workflow revisits the high-fidelity numerical simulator for validation purposes. The experience gained from this work provides valuable guidance for similar CO2 enhanced oil recovery (EOR) projects.
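
The proxy-plus-global-optimizer idea can be illustrated with the short Python sketch below: a neural-network proxy is trained on a handful of (stand-in) simulator runs, and a global optimizer then searches the cheap proxy instead of the expensive simulator. The design variables, bounds, and synthetic objective are assumptions for the example, not the SWP project's actual WAG parameterisation.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)

# Pretend these are simulator runs: design variables are
# [WAG ratio, cycle length (days), injection rate (fraction of max)].
X_train = rng.uniform([0.5, 30, 0.2], [2.0, 180, 1.0], size=(200, 3))
npv_train = (  # stand-in for NPV computed by a reservoir simulator
    -(X_train[:, 0] - 1.2) ** 2
    - 1e-5 * (X_train[:, 1] - 90) ** 2
    + 0.5 * X_train[:, 2]
    + rng.normal(scale=0.02, size=200)
)

# Train a neural-network proxy of the simulator response surface.
proxy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
proxy.fit(X_train, npv_train)

# Global optimization on the cheap proxy instead of the expensive simulator.
result = differential_evolution(
    lambda x: -proxy.predict(x.reshape(1, -1))[0],  # maximize predicted NPV
    bounds=[(0.5, 2.0), (30, 180), (0.2, 1.0)],
    seed=0,
)
print("Proxy-optimal WAG design:", result.x)
# The optimum would then be re-run in the high-fidelity simulator for validation.
```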


2021 ◽  
Author(s):  
Hongbao Zhang ◽  
Baoping Lu ◽  
Lulu Liao ◽  
Hongzhi Bao ◽  
Zhifa Wang ◽  
...  

Abstract Theoretically, the rate of penetration (ROP) model is fundamental to drilling parameter design, selection of ROP improvement tools, and drilling time and cost estimation. Currently, ROP modelling is conducted mainly by two approaches: equation-based and machine learning; machine learning performs better because of its capacity for modelling high-dimensional, non-linear processes. However, in deep or deviated wells, the ROP prediction accuracy of machine learning is often unsatisfactory, mainly because the energy loss along the wellbore and drill string is non-negligible and it is difficult to capture the effect of wellbore geometry in machine learning models by purely data-driven means. It is therefore necessary to develop robust ROP modelling methods for different scenarios. In this paper, the performance of several equation-based and machine learning methods is evaluated on data from 82 wells, and the technical features and applicable scopes of the different methods are analysed. A new machine learning based ROP modelling method suitable for different well path types is proposed. An integrated data processing pipeline was designed to deal with data noise, missing data, and discrete variables. ROP-affecting factors were analysed, including mechanical parameters, hydraulic parameters, bit characteristics, rock properties, and wellbore geometry. Several new features were created from classic drilling theory, such as downhole weight on bit (DWOB), hydraulic impact force, and a formation heterogeneity index, to improve the efficiency of learning from data. A random forest model was trained using cross-validation and hyperparameter optimization. Field test results show that the model can predict the ROP in different hole sections (vertical, deviated and horizontal) and different drilling modes (sliding and rotating drilling), and the average accuracy meets the requirements of well planning. A novel data processing and feature engineering workflow was designed according to the characteristics of ROP modelling for different well path types. An integrated data-driven ROP modelling and optimization software was developed, including mechanical specific energy analysis, bit wear analysis and prediction, 2D and 3D ROP sensitivity analysis, offset well benchmarking, ROP prediction, drilling parameter constraint analysis, and cost-per-meter prediction, providing quantitative evidence for drilling parameter optimization, drilling tool selection and well time estimation.
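
The core modelling step can be sketched in a few lines of Python: engineer a physics-motivated feature, then fit a random forest with cross-validated hyperparameter search. The column names, the crude DWOB approximation, the synthetic data, and the parameter grid are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Crude downhole-WOB proxy: surface WOB reduced by a friction term that
    # grows with hole inclination (a stand-in for along-string energy loss).
    out["dwob"] = out["wob"] * (1.0 - 0.3 * np.sin(np.radians(out["inclination"])))
    return out

# Toy surrogate for the multi-well drilling dataset.
rng = np.random.default_rng(1)
n = 500
data = pd.DataFrame({
    "wob": rng.uniform(5, 25, n),             # weight on bit, tonnes
    "rpm": rng.uniform(60, 180, n),
    "flow_rate": rng.uniform(1000, 3500, n),  # L/min
    "inclination": rng.uniform(0, 90, n),     # degrees
})
data["rop"] = (2.0 * data["wob"] + 0.1 * data["rpm"]
               - 0.05 * data["inclination"] + rng.normal(scale=3.0, size=n))

X = add_engineered_features(data).drop(columns="rop")
y = data["rop"]

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [200, 400], "max_depth": [None, 10, 20]},
    cv=5,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
print("Best params:", search.best_params_, "CV MAE:", -search.best_score_)
```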


2021 ◽  
Author(s):  
Wenxi Gao ◽  
Ishmael Rico ◽  
Yu Sun

People increasingly want to follow trends: as time moves on, they can avoid being left behind only by keeping pace with it. There are many websites for exploring the world, but websites aimed at those who want to show the public something new are uncommon. This paper proposes a web application that recommends trending video content to YouTubers, who sometimes struggle to think of a topic for their next video. Our method has four main steps: YouTube scraping, data processing, prediction with an SVM, and the web page itself. Users enter their ideas in the web app, and the system scrapes YouTube's trending page and processes the data to make a prediction. We ran experiments on different datasets and evaluated the accuracy of our method. The results show that the method is feasible, so people can use it to obtain their own recommendations.
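
A minimal sketch of the "prediction by SVM" step is shown below: video titles are vectorised and classified into trending categories so that a user's idea can be matched against them. The toy titles and labels are made up for illustration; the paper's real pipeline draws its training data from YouTube's trending page.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

titles = [
    "I tried the viral pasta recipe",        "ranking every fast food burger",
    "unboxing the new flagship phone",       "budget gaming PC build guide",
    "my 10 minute morning workout",          "what I eat in a day as a runner",
]
labels = ["food", "food", "tech", "tech", "fitness", "fitness"]

# TF-IDF features feeding a linear SVM classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(titles, labels)

# A creator types an idea; the app suggests which trending category it fits.
print(model.predict(["testing a cheap mechanical keyboard"]))  # -> ['tech']
```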


2021 ◽  
Vol 1 (1) ◽  
pp. 19-25
Author(s):  
Filbert H. Juwono ◽  
Regina Reine

The vision for 6G and beyond communication systems demands higher transmission rates, massive amounts of data processing, and low-latency communication. Orthogonal Frequency Division Multiplexing (OFDM) has been adopted in current 5G networks and is one of the candidates for future communication systems. Although OFDM offers many benefits, including high spectral efficiency and strong robustness against multipath fading channels, it has major challenges such as frequency offset and a high Peak-to-Average Power Ratio (PAPR). In 5G networks there is a significant increase in the number of sensors and other low-power devices, and users or devices may create large numbers of connections and dynamic data processing demands. To deal with this increasingly complex communication network, Machine Learning (ML) has been increasingly used to create intelligent and more efficient networks. This paper discusses the challenges and impacts of embedding ML in OFDM-based communication systems.
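
The PAPR challenge mentioned above is easy to see numerically: the sum of many independently modulated subcarriers in an OFDM symbol occasionally adds up coherently, producing peaks well above the average power. The short Python sketch below illustrates this for an assumed 64-subcarrier QPSK symbol; the parameters are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subcarriers = 64

# Random QPSK symbols on each subcarrier, converted to the time domain by IFFT.
qpsk = (rng.choice([-1, 1], n_subcarriers)
        + 1j * rng.choice([-1, 1], n_subcarriers)) / np.sqrt(2)
time_signal = np.fft.ifft(qpsk) * np.sqrt(n_subcarriers)  # normalise to unit average power

power = np.abs(time_signal) ** 2
papr_db = 10 * np.log10(power.max() / power.mean())
print(f"PAPR of this OFDM symbol: {papr_db:.1f} dB")  # typically several dB above 0
```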


2021 ◽  
Vol 3 ◽  
Author(s):  
Muhammad Kaleem ◽  
Aziz Guergachi ◽  
Sridhar Krishnan

Analysis of long-term multichannel EEG signals for automatic seizure detection is an active area of research that has seen application of methods from different domains of signal processing and machine learning. The majority of approaches developed in this context extract hand-crafted features that are used to train a classifier for eventual seizure detection. Approaches that are data-driven, do not use hand-crafted features, and use small amounts of patients' historical EEG data for classifier training are few in number. The approach presented in this paper falls in the latter category and is based on a signal-derived empirical dictionary approach, which uses empirical mode decomposition (EMD)- and discrete wavelet transform (DWT)-based dictionaries learned using a framework inspired by traditional methods of dictionary learning. Three features associated with traditional dictionary learning approaches, namely the projection coefficients, the coefficient vector and the reconstruction error, are extracted from both EMD- and DWT-based dictionaries for automated seizure detection. This is the first time these features have been applied for automatic seizure detection using an empirical dictionary approach. Small amounts of patients' historical multi-channel EEG data are used for classifier training, and multiple classifiers are used for seizure detection on newer data. In addition, the seizure detection results are validated using 5-fold cross-validation to rule out any bias in the results. The CHB-MIT benchmark database containing long-term EEG recordings of pediatric patients is used for validation of the approach, and seizure detection performance comparable to the state of the art is obtained. Seizure detection is performed using five classifiers, allowing a comparison of the dictionary approaches, features extracted, and classifiers used. The best seizure detection performance is obtained with the EMD-based dictionary, the reconstruction error feature, and a support vector machine classifier, with accuracy, sensitivity and specificity of 88.2%, 90.3%, and 88.1%, respectively. Comparison is also made with other recent studies using the same database. The methodology presented in this paper is shown to be computationally efficient and robust for patient-specific automatic seizure detection. A data-driven methodology using a small amount of patients' historical data is hence demonstrated as a practical solution for automatic seizure detection.
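
One way to picture the dictionary features is sketched below: atoms are derived from per-level DWT reconstructions of a reference (training) epoch, and a new epoch is then described by its projection coefficients onto those atoms plus the reconstruction error. The epoch length, wavelet choice, and random data are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
import pywt

def dwt_dictionary(reference_epoch: np.ndarray, wavelet: str = "db4", level: int = 4) -> np.ndarray:
    """Build dictionary atoms from per-level reconstructions of a reference epoch."""
    coeffs = pywt.wavedec(reference_epoch, wavelet, level=level)
    atoms = []
    for i in range(len(coeffs)):
        kept = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        atom = pywt.waverec(kept, wavelet)[: len(reference_epoch)]
        atoms.append(atom / (np.linalg.norm(atom) + 1e-12))
    return np.column_stack(atoms)            # shape: (epoch_length, n_atoms)

def dictionary_features(epoch: np.ndarray, dictionary: np.ndarray):
    """Projection coefficients and reconstruction error of an epoch on the dictionary."""
    coef, *_ = np.linalg.lstsq(dictionary, epoch, rcond=None)
    recon_error = np.linalg.norm(epoch - dictionary @ coef)
    return coef, recon_error

# Toy usage with synthetic "EEG" epochs (256 samples each).
rng = np.random.default_rng(0)
reference = rng.standard_normal(256)
D = dwt_dictionary(reference)
coef, err = dictionary_features(rng.standard_normal(256), D)
print("projection coefficients:", np.round(coef, 3), "reconstruction error:", round(err, 3))
```

These per-epoch features would then be fed to a classifier such as a support vector machine, trained on a small amount of the patient's historical data.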


Geofluids ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Lihua Shao ◽  
Ru Ji ◽  
Shuyi Du ◽  
Hongqing Song

Rapid and accurate prediction of fluid viscosity in a multiphase reservoir oil system is important for improving oil production in petroleum engineering. This study proposes three viscosity prediction models based on machine learning approaches. Comparison of prediction accuracy shows that the random forest (RF) model predicts the viscosity of each reservoir phase most accurately, with the lowest error percentage and the highest R² values. The RF model is also very fast, with a computing time of 0.53 s. In addition, sensitivity analysis indicates that for a multiphase reservoir system, the viscosity of each phase is determined by different factors. Among them, the oil viscosity, which is vital for oil production, is mainly affected by the molar ratio of gas to oil (MR-GO).
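
The sketch below shows the random-forest regression plus feature-importance style of sensitivity analysis on synthetic data. The feature names and the synthetic viscosity relationship are assumptions for illustration; the study itself uses real reservoir fluid data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame({
    "pressure_MPa": rng.uniform(5, 40, n),
    "temperature_C": rng.uniform(40, 120, n),
    "mr_go": rng.uniform(0.1, 3.0, n),        # molar ratio of gas to oil
    "api_gravity": rng.uniform(20, 45, n),
})
# Synthetic oil viscosity dominated by the gas-to-oil molar ratio.
y = (10.0 / (1.0 + 2.0 * X["mr_go"])
     + 0.02 * (120 - X["temperature_C"])
     + rng.normal(scale=0.1, size=n))

model = RandomForestRegressor(n_estimators=300, random_state=0)
model.fit(X, y)

# Feature importances serve as a simple sensitivity ranking.
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))  # mr_go should rank highest
```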


2018 ◽  
Author(s):  
Andrew J. Steele ◽  
S. Aylin Cakiroglu ◽  
Anoop D. Shah ◽  
Spiros C. Denaxas ◽  
Harry Hemingway ◽  
...  

Abstract Prognostic modelling is important in clinical practice and epidemiology for patient management and research. Electronic health records (EHR) provide large quantities of data for such models, but conventional epidemiological approaches require significant researcher time to implement. Expert selection of variables, fine-tuning of variable transformations and interactions, and imputation of missing values are time-consuming and could bias subsequent analysis, particularly given that missingness in EHR is both high and may carry meaning. Using a cohort of over 80,000 patients from the CALIBER programme, we performed a systematic comparison of several machine-learning approaches in EHR. We used Cox models and random survival forests, with and without imputation, on 27 expert-selected variables to predict all-cause mortality. We also used Cox models, random forests and elastic net regression on an extended dataset with 586 variables to build prognostic models and identify novel prognostic factors without prior expert input. We observed that data-driven models applied to an extended dataset can outperform conventional models for prognosis, without data preprocessing or imputation of missing values, and with no need to scale or transform continuous data. An elastic net Cox regression with 586 unimputed variables (continuous values discretised) achieved a C-index of 0.801 (bootstrapped 95% CI 0.799 to 0.802), compared with 0.793 (0.791 to 0.794) for a traditional Cox model comprising 27 expert-selected variables with imputation of missing values. We also found that data-driven models allow identification of novel prognostic variables; that the absence of values for particular variables carries meaning and can have significant implications for prognosis; and that variables often have a nonlinear association with mortality, which discretised Cox models and random forests can elucidate. This demonstrates that machine-learning approaches applied to raw EHR data can be used to build reliable models for research and clinical practice, and to identify novel predictive variables and their effects to inform future research.
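
The finding that missingness itself can be prognostic is easy to illustrate with a penalised Cox model in which a variable's absence is kept as an explicit indicator rather than imputed away. The sketch below uses the lifelines library with synthetic data; the data, penalty settings, and variable names are assumptions, not the CALIBER setup.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(40, 85, n)
biomarker = rng.normal(5.0, 1.5, n)
measured = rng.random(n) < 0.7            # 30% of biomarker values never recorded

# Hazard depends on age, the biomarker, and on whether it was measured at all.
hazard = 0.02 * np.exp(0.03 * (age - 60)
                       + 0.2 * (biomarker - 5) * measured
                       + 0.4 * (~measured))
time = rng.exponential(1.0 / hazard)
observed = time < 10.0                    # administrative censoring at 10 years

df = pd.DataFrame({
    "age": age,
    "biomarker": np.where(measured, biomarker, 0.0),  # placeholder, not an imputed value
    "biomarker_missing": (~measured).astype(int),     # explicit missingness indicator
    "T": np.minimum(time, 10.0),
    "E": observed.astype(int),
})

cph = CoxPHFitter(penalizer=0.1, l1_ratio=0.5)        # elastic-net-style penalty
cph.fit(df, duration_col="T", event_col="E")
cph.print_summary()
print("C-index:", round(cph.concordance_index_, 3))
```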


2018 ◽  
Vol 2018 ◽  
pp. 1-18 ◽  
Author(s):  
Xianming Dou ◽  
Yongguo Yang

Remarkable progress has been made over the last decade toward characterizing the mechanisms that dominate the exchange of water vapor between the biosphere and the atmosphere. This is due partly to the considerable development of machine learning techniques, which allow the scientific community to use these advanced tools to approximate the nonlinear processes affecting the variation of water vapor in terrestrial ecosystems. Three novel machine learning approaches, namely the group method of data handling (GMDH), the extreme learning machine (ELM), and the adaptive neuro-fuzzy inference system (ANFIS), were developed to simulate and forecast daily evapotranspiration (ET) at four different grassland sites based on flux tower data obtained with the eddy covariance method. These models were compared with extensively used data-driven models, including the artificial neural network, the generalized regression neural network, and the support vector machine (SVM). Moreover, the influence of internal functions on the corresponding models (SVM, ELM, and ANFIS) was investigated. Most of the developed models did a good job of simulating and forecasting daily ET at the four sites. In addition to their robustness and simplicity, the newly proposed methods achieved estimates comparable to those of the conventional approaches and can accordingly be used as promising alternatives to traditional methods. It was further found that the generalization performance of the ELM, ANFIS, and SVM models depended strongly on their respective internal functions, especially for the SVM.
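
The sensitivity to internal functions can be illustrated with a small Python sketch in which the same SVM regressor is cross-validated with different kernels on a synthetic ET dataset. The meteorological drivers and the synthetic relationship below are illustrative assumptions, not the flux-tower data used in the study.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 400
X = np.column_stack([
    rng.uniform(0, 30, n),     # air temperature (deg C)
    rng.uniform(0, 800, n),    # net radiation (W/m2)
    rng.uniform(0.2, 3.0, n),  # vapour pressure deficit (kPa)
])
et = 0.004 * X[:, 1] + 0.3 * X[:, 2] + 0.02 * X[:, 0] + rng.normal(scale=0.2, size=n)

# Same regressor, different internal (kernel) functions.
for kernel in ["linear", "rbf", "sigmoid"]:
    model = make_pipeline(StandardScaler(), SVR(kernel=kernel, C=10.0))
    score = cross_val_score(model, X, et, cv=5, scoring="r2").mean()
    print(f"{kernel:8s} kernel: mean CV R2 = {score:.3f}")
```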

