Transformer-Based Deep Learning Models for Well Log Processing and Quality Control by Modelling Global Dependence of the Complex Sequences

2021 ◽  
Author(s):  
Ashutosh Kumar

Abstract A single well from any mature field produces approximately 1.7 million Measurement While Drilling (MWD) data points. To diagnose long sequences of extremely noisy data, we either use cross-correlation and covariance measurements or Long Short-Term Memory (LSTM) based deep learning algorithms. An LSTM context size of 200 tokens barely accounts for the entire depth. The proposed work develops an application of a Transformer-based deep learning algorithm to diagnose and predict events in complex sequences of well-log data. Sequential models learn geological patterns and petrophysical trends to detect events across the depths of well-log data. However, vanishing gradients, exploding gradients and the limits of convolutional filters restrict the diagnosis of ultra-deep wells with complex subsurface information, and the vast number of operations required to detect events between two subsurface points at large separation limits these models further. Transformer-based Models (TbMs) rely on non-sequential modelling that uses self-attention to relate information from different positions in the well-log sequence, allowing the creation of an end-to-end, non-sequential, parallel memory network. We use approximately 21 million data points from 21 wells of Volve for the experiment. LSTMs, together with auto-regression (AR), autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models, conventionally model the events in time-series well-logs. However, the complex global dependencies needed to detect events in a heterogeneous subsurface are challenging for these sequence models. In the presented work we begin with one meter of depth data from Volve, an oil field in the North Sea, and then proceed up to 1000 meters. Initially the LSTM and ARIMA models were acceptable, but as depth increased beyond a few hundred meters their diagnoses started underperforming and a new methodology was required. TbMs have already outperformed several models in modelling large sequences for natural language processing tasks, so they are very promising for modelling well-log data with very large depth separation. We scale features and labels according to the maximum and minimum values present in the training dataset and then use a sliding window to obtain training and evaluation data pairs from the well-logs. Additional subsurface features were able to encode some information in the conventional sequential models, but the results did not compare favourably with the TbMs. TbMs achieved a Root Mean Square Error of 0.27 on a 0-1 scale while diagnosing depths of up to 5000 meters. This is the first paper to show a successful application of Transformer-based deep learning models for well-log diagnosis. The presented model uses a self-attention mechanism to learn complex dependencies and non-linear events from the well-log data. Moreover, the experimental setting discussed in the paper acts as a generalized framework for data from ultra-deep wells and their extremely heterogeneous subsurface environments.
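The preprocessing described above (min-max scaling against the training set, then a sliding window to form input/target pairs) can be sketched roughly as follows; the window length, stride and synthetic curve are illustrative assumptions rather than values from the paper.

```python
import numpy as np

def min_max_scale(curves, train_min, train_max):
    """Scale features/labels to [0, 1] using the training-set min/max."""
    return (curves - train_min) / (train_max - train_min + 1e-12)

def sliding_windows(log, window=64, stride=1):
    """Cut a 1-D well-log curve into (input window, next-sample target) pairs."""
    X, y = [], []
    for start in range(0, len(log) - window, stride):
        X.append(log[start:start + window])
        y.append(log[start + window])          # next depth sample as the label
    return np.asarray(X), np.asarray(y)

# Illustrative usage with a synthetic gamma-ray-like curve
rng = np.random.default_rng(0)
gr = np.cumsum(rng.normal(size=5000))          # stand-in for one well-log curve
gr_scaled = min_max_scale(gr, gr.min(), gr.max())
X_train, y_train = sliding_windows(gr_scaled, window=64)
```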

Geophysics ◽  
2020 ◽  
Vol 86 (1) ◽  
pp. M1-M15
Author(s):  
Zhaoqi Gao ◽  
Chuang Li ◽  
Bing Zhang ◽  
Xiudi Jiang ◽  
Zhibin Pan ◽  
...  

As a rock-physics parameter, density plays a crucial role in lithology interpretation, reservoir evaluation, and description. However, density can hardly be inverted directly from seismic data, especially for large-scale structures; thus, additional information is needed to build such a large-scale model. Usually, well-log data can be used to build a large-scale density model through extrapolation; however, this approach only works well for simple cases and loses effectiveness when the medium is laterally heterogeneous. We have adopted a deep-learning-based method to build a large-scale density model from seismic and well-log data. A long short-term memory network is used to learn the relation between seismic data and large-scale density. In addition to the data pairs directly obtained from well logs, many velocity and density models randomly generated from the statistical distributions of the well logs are also used to produce further pairs of seismic data and the corresponding large-scale density. This greatly enlarges the size and diversity of the training data set and consequently leads to a significant improvement of the proposed method in dealing with a heterogeneous medium even when only a few well logs are available. Our method is applied to synthetic and field data examples to verify its performance and compare it with the well extrapolation method, and the results clearly show that the proposed method works well even with only a few well logs. In the field data example in particular, the large-scale density model built by the proposed method is improved by 11.9666 dB in peak signal-to-noise ratio and by 0.6740 in structural similarity compared with that of the well extrapolation method.
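A minimal sketch of the learning step described above, an LSTM mapping a seismic trace to a large-scale density profile, is shown below; the network size, PyTorch implementation and random stand-in data are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class SeismicToDensityLSTM(nn.Module):
    """Maps a seismic trace (one sample per step) to a density profile of the same length."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, seismic):                 # seismic: (batch, steps, 1)
        out, _ = self.lstm(seismic)
        return self.head(out)                   # density: (batch, steps, 1)

# Illustrative training loop on random stand-in data (real pairs would come
# from well logs plus randomly generated velocity/density models)
model = SeismicToDensityLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
seismic = torch.randn(8, 256, 1)                # fake seismic traces
density = torch.randn(8, 256, 1)                # fake large-scale density profiles
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(model(seismic), density)
    loss.backward()
    opt.step()
```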


2020 ◽  
pp. 42-49
Author(s):  
Monika Gupta

Internet of Things (IoT) based healthcare applications have grown exponentially over the past decade. With the increasing number of fatalities due to cardiovascular diseases (CVD), it is the need of the hour to detect any signs of cardiac abnormalities as early as possible, which calls for automating the detection and classification of such cardiac abnormalities for physicians. The problem is that there is not enough data to train Deep Learning models to classify ECG signals accurately, because of the sensitive nature of the data and the rarity of certain cases involved in CVDs. In this paper, we propose a framework which uses Generative Adversarial Networks (GAN) to create synthetic training data for the classes with fewer data points, in order to improve the performance of Deep Learning models trained on the dataset. With data arriving from sensors via the cloud and this model classifying the ECG signals, we expect the framework to be functional, accurate and efficient.
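A minimal sketch of the core idea, a GAN generating synthetic one-dimensional ECG segments for an under-represented class, is given below; the segment length, layer sizes and training loop are illustrative assumptions rather than the proposed framework's exact design.

```python
import torch
import torch.nn as nn

SIGNAL_LEN, LATENT = 187, 32                    # per-beat segment length is an assumption

# Generator: noise vector -> synthetic ECG segment for a scarce class
G = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(),
                  nn.Linear(128, SIGNAL_LEN), nn.Tanh())
# Discriminator: ECG segment -> probability that it is real
D = nn.Sequential(nn.Linear(SIGNAL_LEN, 128), nn.LeakyReLU(0.2),
                  nn.Linear(128, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()
real_beats = torch.randn(64, SIGNAL_LEN)        # stand-in for scarce-class ECG beats

for _ in range(10):
    # Discriminator step: real beats labelled 1, generated beats labelled 0
    fake = G(torch.randn(64, LATENT)).detach()
    loss_d = bce(D(real_beats), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: try to fool the discriminator
    loss_g = bce(D(G(torch.randn(64, LATENT))), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

synthetic = G(torch.randn(500, LATENT))         # augmentation data for the minority class
```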


2021 ◽  
Author(s):  
Andrew McDonald ◽  

Decades of subsurface exploration and characterisation have led to the collation and storage of large volumes of well-related data. The amount of data gathered daily continues to grow rapidly as technology and recording methods improve. With the increasing adoption of machine learning techniques in the subsurface domain, it is essential that the quality of the input data is carefully considered when working with these tools. If the input data is of poor quality, the impact on the precision and accuracy of the prediction can be significant. Consequently, this can affect key decisions about the future of a well or a field. This study focuses on well log data, which can be highly multi-dimensional, diverse and stored in a variety of file formats. Well log data exhibits the key characteristics of Big Data: Volume, Variety, Velocity, Veracity and Value. Well data can include numeric values, text values, waveform data, image arrays, maps, volumes, etc., all of which can be indexed by time or depth in a regular or irregular way. A significant portion of time can be spent gathering data and quality checking it prior to carrying out petrophysical interpretations and applying machine learning models. Well log data can be affected by numerous issues causing a degradation in data quality. These include missing data - ranging from single data points to entire curves; noisy data from tool-related issues; borehole washout; processing issues; incorrect environmental corrections; and mislabelled data. Having vast quantities of data does not mean it can all be passed into a machine learning algorithm with the expectation that the resultant prediction is fit for purpose. It is essential that the most important and relevant data is passed into the model through appropriate feature selection techniques. Not only does this improve the quality of the prediction, it also reduces computational time and can provide a better understanding of how the models reach their conclusions. This paper reviews data quality issues typically faced by petrophysicists when working with well log data and deploying machine learning models. First, an overview of machine learning and Big Data is covered in relation to petrophysical applications. Second, data quality issues commonly faced with well log data are discussed. Third, methods are suggested for dealing with data issues prior to modelling. Finally, multiple case studies are discussed covering the impacts of data quality on predictive capability.
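As a concrete illustration of the simpler checks mentioned above, the sketch below flags missing values and washout-affected intervals in a depth-indexed log set before modelling; the curve names, the -999.25 null value and the caliper threshold are common conventions adopted here as assumptions.

```python
import numpy as np
import pandas as pd

def flag_bad_data(logs: pd.DataFrame, bit_size: float = 8.5,
                  washout_margin: float = 1.0) -> pd.DataFrame:
    """Add simple quality flags to a depth-indexed well-log DataFrame."""
    out = logs.copy()
    # Missing data: single samples or whole curves recorded as NaN / -999.25
    out = out.replace(-999.25, np.nan)
    out["FLAG_MISSING"] = out[["GR", "RHOB", "DT"]].isna().any(axis=1)
    # Borehole washout: caliper reading much larger than bit size
    out["FLAG_WASHOUT"] = out["CALI"] > (bit_size + washout_margin)
    return out

# Illustrative usage with a tiny synthetic log
logs = pd.DataFrame({
    "GR":   [45.0, 60.2, np.nan, 80.1],
    "RHOB": [2.45, 2.40, 2.10, -999.25],
    "DT":   [70.0, 72.5, 95.0, 88.0],
    "CALI": [8.6, 8.7, 11.2, 12.0],
}, index=pd.Index([2500.0, 2500.5, 2501.0, 2501.5], name="DEPTH"))
flagged = flag_bad_data(logs)
```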


2021 ◽  
Author(s):  
Mohammad Rasheed Khan ◽  
Zeeshan Tariq ◽  
Mohamed Mahmoud

Abstract The photoelectric factor (PEF) is one of the functional parameters of a hydrocarbon reservoir and can provide invaluable data for reservoir characterization. Well logs are critical to formation evaluation processes; however, they are not always readily available due to unfeasible logging conditions. In addition, with the call for efficiency in the hydrocarbon E&P business, it has become imperative to optimize logging programs to acquire maximum data with minimal cost impact. As a result, the present study proposes an improved strategy for generating a synthetic log by forming a quantitative relationship between conventional well log data, rock mineralogical content and PEF. 230 data points were utilized to implement the machine learning (ML) methodology, which begins with a statistical analysis scheme. The input logs used for architecture establishment are the density and sonic logs. Moreover, rock mineralogical content (carbonate, quartz, clay), which is strongly correlated to the PEF, has been incorporated for model development. At the next stage of this study, an artificial neural network (ANN) architecture was developed and optimized to predict the PEF from conventional well log data. A sub-set of data points was used for ML model construction and another, unseen set was employed to assess the model performance. Furthermore, a comprehensive error metrics analysis is used to evaluate the performance of the proposed model. The synthetic PEF log generated using the developed ANN correlation is compared with the actual well log data available and demonstrates an average absolute percentage error of less than 0.38. In addition, the comprehensive error metric analysis yields a coefficient of determination of more than 0.99 and a root mean squared error of only 0.003. The numerical analysis of the error metrics points towards the robustness of the ANN model and its capability to link mineralogical content with the PEF.
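A minimal sketch of this type of workflow, a small neural network regressor predicting PEF from density, sonic and mineral volume fractions, is shown below; the scikit-learn model, layer sizes and toy mixing-rule data are assumptions and not the paper's optimized ANN.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Features: density (RHOB), sonic (DT), and carbonate/quartz/clay volume fractions
rng = np.random.default_rng(1)
X = rng.uniform([2.0, 50.0, 0.0, 0.0, 0.0], [3.0, 140.0, 1.0, 1.0, 1.0], size=(230, 5))
# Toy target: a linear mineral mixing rule with approximate PEF values plus noise
pef = 5.08 * X[:, 2] + 1.81 * X[:, 3] + 3.42 * X[:, 4] + rng.normal(0, 0.05, 230)

X_train, X_test, y_train, y_test = train_test_split(X, pef, test_size=0.25, random_state=0)
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000, random_state=0))
model.fit(X_train, y_train)
r2 = model.score(X_test, y_test)     # coefficient of determination on unseen data
```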


2021 ◽  
Author(s):  
Ryan Banas ◽  
Andrew McDonald ◽  
Tegwyn Perkins ◽  
...  

Subsurface analysis-driven field development requires quality data as input into analysis, modelling, and planning. In the case of many conventional reservoirs, pay intervals are often well consolidated and maintain integrity under drilling and geological stresses, providing an ideal logging environment. Consequently, editing well logs is often overlooked or dismissed entirely. Petrophysical analysis, however, is not always constrained to conventional pay intervals. When developing an unconventional reservoir, pay sections may be composed of shales. The requirement for edited and quality-checked logs becomes crucial to accurately assess the storage volumes in place. Edited curves can also serve as inputs to engineering studies, geological and geophysical models, reservoir evaluation, and many of the machine learning models employed today. As an example, hydraulic fracturing model inputs may span adjacent shale beds around a target reservoir, which are frequently washed out. These washed-out sections may seriously impact logging measurements of interest, such as bulk density and acoustic compressional slowness, which are used to generate elastic properties and compute geomechanical curves. Two classes of machine learning algorithms for identifying outliers and poor-quality data due to bad hole conditions are discussed: supervised and unsupervised learning. The first allows the expert to train a model from existing and categorized data, whereas unsupervised learning algorithms learn from a collection of unlabeled data. Each class has distinct advantages and disadvantages. Identifying outliers and conditioning well logs prior to a petrophysical analysis or machine learning model can be a time-consuming and laborious process, especially when large multi-well datasets are considered. In this study, a new supervised learning algorithm is presented that utilizes multiple-linear regression analysis to repair well log data in an iterative and automated routine. This technique allows outliers to be identified and repaired whilst improving the efficiency of the log data editing process without compromising accuracy. The algorithm uses systematic logic and curve predictions derived via multiple linear regression to repair various well logs. A clear improvement in efficiency is observed when the algorithm is compared to other currently used methods, including manual processing by a petrophysicist and unsupervised outlier detection methods. The algorithm can also be leveraged over multiple wells to produce more generalized predictions. Through a platform created to quickly identify and repair invalid log data, the results are controlled through input and supervision by the user. This methodology is not a direct replacement of an expert interpreter, but complements one by allowing the petrophysicist to leverage computing power, improve consistency, reduce error and improve turnaround time.
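The repair idea can be sketched roughly as follows: fit a multiple linear regression on samples believed to be good, grow the bad-data mask where residuals are extreme, and replace flagged samples with predictions. The curve names, thresholds and synthetic data below are illustrative assumptions, not the published algorithm's logic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def repair_curve(target, predictors, bad_mask, n_iter=3, z_thresh=3.0):
    """Iteratively fit a multiple-linear regression on samples believed to be good,
    extend the bad mask where residuals are extreme, and replace flagged samples."""
    repaired = target.copy()
    mask = bad_mask.copy()
    for _ in range(n_iter):
        model = LinearRegression().fit(predictors[~mask], target[~mask])
        pred = model.predict(predictors)
        resid = target - pred
        sigma = resid[~mask].std()
        mask |= np.abs(resid) > z_thresh * sigma   # grow mask around remaining outliers
        repaired[mask] = pred[mask]
    return repaired

# Illustrative usage: repair a washout-affected density log from sonic and neutron curves
rng = np.random.default_rng(2)
dt, nphi = rng.normal(80, 10, 1000), rng.normal(0.2, 0.05, 1000)
rhob = 3.1 - 0.008 * dt - 1.2 * nphi + rng.normal(0, 0.01, 1000)    # synthetic relationship
bad = rng.random(1000) < 0.1                                        # 10% flagged as bad hole
rhob_damaged = np.where(bad, rhob + rng.normal(-0.4, 0.1, 1000), rhob)
rhob_fixed = repair_curve(rhob_damaged, np.column_stack([dt, nphi]), bad)
```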


Author(s):  
Minsoo Ji ◽  
Seoyoon Kwon ◽  
Gayoung Park ◽  
Baehyun Min ◽  
Xuan Huy Nguyen

2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Rong Liu ◽  
Yan Liu ◽  
Yonggang Yan ◽  
Jing-Yan Wang

Deep learning models, such as the deep convolutional neural network and the deep long short-term memory model, have achieved great success in many pattern classification applications over shallow machine learning models with hand-crafted features. The main reason is the ability of deep learning models to automatically extract hierarchical features from massive data through multiple layers of neurons. However, in many other situations, existing deep learning models still cannot obtain satisfying results due to the limitation of their inputs. Existing deep learning models take only the data instance of an input point and completely ignore the other data points in the dataset, which potentially provide critical insight for the classification of the given input. To close this gap, in this paper we show that the neighboring data points, besides the input data point itself, can boost a deep learning model's performance significantly, and we design a novel deep learning model which takes both the data instance of an input point and its neighbors' classification responses as inputs. In addition, we develop an iterative algorithm which alternately updates the neighbors of the data points, according to the deep representations output by the model, and the parameters of the deep learning model. The proposed algorithm, named "Iterative Deep Neighborhood (IDN)," shows its advantages over state-of-the-art deep learning models on tasks of image classification, text sentiment analysis, property price trend prediction, etc.
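A minimal sketch of the IDN idea, feeding each instance's features together with its neighbors' current classification responses and alternating between parameter updates and response updates, is given below; the k-nearest-neighbor choice, layer sizes and toy data are assumptions for illustration.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

# Toy two-class problem with 20-dimensional features
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 20)).astype(np.float32)
y = (X[:, 0] + X[:, 1] > 0).astype(np.int64)
K, N_CLASSES = 5, 2

# Network input = instance features + mean of its neighbors' class responses
net = nn.Sequential(nn.Linear(20 + N_CLASSES, 64), nn.ReLU(), nn.Linear(64, N_CLASSES))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

responses = np.full((500, N_CLASSES), 1.0 / N_CLASSES, dtype=np.float32)   # start uniform
neighbours = NearestNeighbors(n_neighbors=K + 1).fit(X).kneighbors(
    X, return_distance=False)[:, 1:]                                        # drop self

for _ in range(5):                                   # alternate parameter / response updates
    inp = torch.from_numpy(np.hstack([X, responses[neighbours].mean(axis=1)]))
    tgt = torch.from_numpy(y)
    for _ in range(50):                              # update network parameters
        opt.zero_grad()
        loss = loss_fn(net(inp), tgt)
        loss.backward()
        opt.step()
    with torch.no_grad():                            # refresh the neighbors' responses
        responses = torch.softmax(net(inp), dim=1).numpy()
    # (The paper also re-computes neighbors from the learned deep representations;
    # the raw features are reused here for simplicity.)
```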


Author(s):  
Ahmad Muraji Suranto ◽  
Aris Buntoro ◽  
Carolus Prasetyadi ◽  
Ricky Adi Wibowo

In modeling the hydraulic fracturing program for unconventional shale reservoirs, information about the elastic rock properties is needed, namely Young's modulus and Poisson's ratio, as the basis for determining the formation depth intervals with high brittleness. The elastic rock properties (Young's modulus and Poisson's ratio) are geomechanical parameters used to identify rock brittleness using core data (static data) and well log data (dynamic data). A common problem is that core data, the most reliable data, are not available, so well log data are used instead. The principle of measuring elastic rock properties in the rock mechanics laboratory is very different from measurement with well logs: laboratory measurements are made at high stresses/strains, low strain rates, and usually drained conditions, while well logging measures downhole with high-frequency sonic vibrations at very low stresses/strains, high strain rates, and always undrained conditions. For this reason, it is necessary to convert the dynamic elastic rock properties (Poisson's ratio and Young's modulus) to static values using empirical equations. The conversion of the elastic rock properties (well logs) from dynamic to static using the empirical calculation method shows a significant shift in the values of Young's modulus and Poisson's ratio, namely a shift from dominantly ductile zones to dominantly brittle zones. The conversion results were validated against rock mechanical test results from analog outcrop cores (static), showing that the results are sufficiently correlated based on the distribution range.
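The dynamic properties referred to above are conventionally computed from compressional and shear slowness plus bulk density, after which an empirical correction is applied; the sketch below uses the standard elasticity relations with placeholder linear correction coefficients, not the calibrated equations from this study.

```python
import numpy as np

def dynamic_elastic_properties(dtc_us_ft, dts_us_ft, rhob_g_cc):
    """Dynamic Poisson's ratio and Young's modulus (GPa) from sonic and density logs."""
    vp = 304800.0 / dtc_us_ft            # compressional velocity, m/s (from us/ft slowness)
    vs = 304800.0 / dts_us_ft            # shear velocity, m/s
    rho = rhob_g_cc * 1000.0             # bulk density, kg/m^3
    nu = (vp**2 - 2.0 * vs**2) / (2.0 * (vp**2 - vs**2))
    g = rho * vs**2                      # shear modulus, Pa
    e = 2.0 * g * (1.0 + nu) / 1e9       # Young's modulus, GPa
    return nu, e

def dynamic_to_static(e_dyn_gpa, a=0.8, b=-1.0):
    """Generic linear correction E_static = a*E_dynamic + b (placeholder coefficients)."""
    return a * e_dyn_gpa + b

# Illustrative usage for a single shale depth sample
nu_dyn, e_dyn = dynamic_elastic_properties(dtc_us_ft=85.0, dts_us_ft=150.0, rhob_g_cc=2.55)
e_static = dynamic_to_static(e_dyn)
```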

