A Comparison of Machine Learning Processes for Classification of Rock Units Using Well Log Data

Author(s):
V. Carreira
C. Ponte Neto
R. Bijani

Author(s):
Mohammad Farsi
Nima Mohamadian
Hamzeh Ghorbani
David A. Wood
Shadfar Davoodi
...

2021
Vol 9

Author(s):
Thomas Martin
Ross Meyer
Zane Jobe

Machine-learning algorithms have been used by geoscientists for more than 40 years to infer geologic and physical properties from hydrocarbon exploration and development wells. These techniques historically utilize digital well-log information, which, like any remotely sensed measurement, has resolution limitations. Core is the only subsurface data that is true to geologic scale and heterogeneity. However, core description and analysis are time-intensive, and therefore most core data are not utilized to their full potential. Quadrant 204 on the United Kingdom Continental Shelf has publicly available, open-source core and well-log data. This study utilizes this dataset and machine-learning models to predict lithology and facies at the centimeter scale. We selected 12 wells from the Q204 region with well-log and core data from the Schiehallion, Foinaven, Loyal, and Alligin hydrocarbon fields. We interpreted training data from 659 m of core at the sub-centimeter scale, utilizing a lithology-based labeling scheme (five classes) and a depositional-process-based facies labeling scheme (six classes). Utilizing a "color-channel log" (CCL) that summarizes the core image at each depth interval, our best-performing trained model predicts the correct lithology with 69% accuracy (i.e., the predicted lithology output from the model is the same as the interpreted lithology) and predicts the individual lithology classes of sandstone and mudstone with over 80% accuracy. The CCL data require less compute power than core-image data and generate more accurate results. While the process-based facies labels better characterize turbidites and hybrid-event-bed stratigraphy, the facies predictions were less accurate than the lithology predictions. In all cases, the standard well-log data cannot accurately predict lithology or facies at the centimeter scale. The machine-learning workflow developed for this study can unlock warehouses full of high-resolution data in a multitude of geological settings, and it can be applied to other geographic areas and deposit types where large quantities of photographed core material are available. This research establishes an open-source, Python-based machine-learning workflow to analyze open-source core-image data in a scalable, reproducible way. We anticipate that this study will serve as a baseline for future research and analysis of borehole and core data.
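To make the CCL idea concrete, the following is a minimal sketch of how such a log might be computed from a core photograph, assuming depth runs down the image rows. The file name, the plain per-row RGB averaging, and the use of NumPy/Pillow are illustrative assumptions, not the study's exact implementation.

# Minimal sketch (not the authors' implementation) of a color-channel log:
# depth is assumed to run down the image rows, and each row is collapsed
# to its mean R, G, B values.
import numpy as np
from PIL import Image

def color_channel_log(image_path):
    """Return one mean (R, G, B) triplet per depth row of a core photo."""
    rgb = np.asarray(Image.open(image_path).convert("RGB"), dtype=float)
    return rgb.mean(axis=1)  # average across core width -> (n_rows, 3)

ccl = color_channel_log("core_photo.png")  # hypothetical file name
print(ccl.shape)  # (n_depth_rows, 3)

Collapsing each image row to three numbers is what makes a CCL far cheaper to train on than the full image, while still retaining the color contrast between lithologies.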


2021
Author(s):
Andrew McDonald

Decades of subsurface exploration and characterisation have led to the collation and storage of large volumes of well-related data. The amount of data gathered daily continues to grow rapidly as technology and recording methods improve. With the increasing adoption of machine learning techniques in the subsurface domain, it is essential that the quality of the input data is carefully considered when working with these tools. If the input data is of poor quality, the impact on the precision and accuracy of the prediction can be significant. Consequently, this can impact key decisions about the future of a well or a field. This study focuses on well log data, which can be highly multi-dimensional, diverse, and stored in a variety of file formats. Well log data exhibits the key characteristics of Big Data: Volume, Variety, Velocity, Veracity and Value. Well data can include numeric values, text values, waveform data, image arrays, maps, volumes, etc., all of which can be indexed by time or depth in a regular or irregular way. A significant portion of time can be spent gathering data and quality checking it prior to carrying out petrophysical interpretations and applying machine learning models. Well log data can be affected by numerous issues causing a degradation in data quality. These include missing data, ranging from single data points to entire curves; noisy data from tool-related issues; borehole washout; processing issues; incorrect environmental corrections; and mislabelled data. Having vast quantities of data does not mean it can all be passed into a machine learning algorithm with the expectation that the resultant prediction is fit for purpose. It is essential that the most important and relevant data is passed into the model through appropriate feature selection techniques. Not only does this improve the quality of the prediction, but it also reduces computational time and can provide a better understanding of how the models reach their conclusions. This paper reviews data quality issues typically faced by petrophysicists when working with well log data and deploying machine learning models. First, an overview of machine learning and Big Data is given in relation to petrophysical applications. Second, data quality issues commonly faced with well log data are discussed. Third, methods are suggested for dealing with data issues prior to modelling. Finally, multiple case studies are discussed covering the impacts of data quality on predictive capability.
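As an illustration of the kind of automated screening discussed above, the sketch below flags two common issues, missing samples and borehole washout, in a well-log table. The curve mnemonics (RHOB, CALI, BS) follow common industry conventions, and the washout threshold is an assumption chosen for the example.

# Illustrative quality screening of a well-log DataFrame. Curve mnemonics
# (RHOB = bulk density, CALI = caliper, BS = bit size) and the 0.75-inch
# washout margin are assumptions for this sketch.
import pandas as pd

def quality_flags(logs: pd.DataFrame, washout_margin: float = 0.75) -> pd.DataFrame:
    flags = pd.DataFrame(index=logs.index)
    flags["missing_rhob"] = logs["RHOB"].isna()  # gaps in the density curve
    flags["washout"] = (logs["CALI"] - logs["BS"]) > washout_margin  # enlarged hole
    return flags

Flags such as these can then be used to exclude or down-weight suspect intervals before feature selection and model training.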


2021
Author(s):
Mohammad Rasheed Khan
Zeeshan Tariq
Mohamed Mahmoud

Abstract The photoelectric factor (PEF) is one of the functional parameters of a hydrocarbon reservoir that can provide invaluable data for reservoir characterization. Well logs are critical to formation evaluation processes; however, they are not always readily available due to unfeasible logging conditions. In addition, with the call for efficiency in the hydrocarbon E&P business, it has become imperative to optimize logging programs to acquire maximum data with minimal cost impact. As a result, the present study proposes an improved strategy for generating a synthetic log by quantitatively relating conventional well log data and rock mineralogical content to the PEF. A total of 230 data points were utilized to implement the machine learning (ML) methodology, which begins with a statistical analysis scheme. The input logs used to establish the architecture are the density and sonic logs. Moreover, rock mineralogical content (carbonate, quartz, clay), which is strongly correlated to the PEF, has been incorporated into model development. At the next stage of this study, an artificial neural network (ANN) architecture was developed and optimized to predict the PEF from conventional well log data. A subset of data points was used for ML model construction, and another unseen set was employed to assess the model performance. Furthermore, a comprehensive error metrics analysis is used to evaluate the performance of the proposed model. The synthetic PEF log generated using the developed ANN correlation is compared with the actual well log data available and demonstrates an average absolute percentage error of less than 0.38. In addition, a comprehensive error metric analysis is presented, which shows a coefficient of determination of more than 0.99 and a root mean squared error of only 0.003. The numerical analysis of the error metrics points toward the robustness of the ANN model and its capability to link mineralogical content with the PEF.
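The abstract does not specify the network architecture, so the sketch below only illustrates the general approach: a small feed-forward regressor (here scikit-learn's MLPRegressor as a stand-in) mapping density, sonic, and mineralogical fractions to PEF, evaluated with the same error metrics the authors report. The layer sizes, scaling, and placeholder data are all assumptions.

# Conceptual sketch of an ANN mapping well-log and mineralogy inputs to PEF.
# Architecture and hyper-parameters are illustrative, not the paper's.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score, mean_squared_error

# Placeholder for the real dataset: 230 samples of
# [density, sonic, carbonate, quartz, clay] -> PEF.
X = np.random.rand(230, 5)
y = np.random.rand(230)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(10, 10), max_iter=2000, random_state=0),
)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
print("R2:  ", r2_score(y_te, pred))
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)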


2021
Author(s):
Ryan Banas
Andrew McDonald
Tegwyn Perkins
...

Subsurface analysis-driven field development requires quality data as input into analysis, modelling, and planning. In the case of many conventional reservoirs, pay intervals are often well consolidated and maintain integrity under drilling and geological stresses, providing an ideal logging environment. Consequently, editing well logs is often overlooked or dismissed entirely. Petrophysical analysis, however, is not always constrained to conventional pay intervals. When developing an unconventional reservoir, pay sections may be composed of shales. The requirement for edited and quality-checked logs becomes crucial to accurately assess storage volumes in place. Edited curves can also serve as inputs to engineering studies, geological and geophysical models, reservoir evaluation, and many machine learning models employed today. As an example, hydraulic fracturing model inputs may span adjacent shale beds around a target reservoir, which are frequently washed out. These washed-out sections may seriously impact logging measurements of interest, such as bulk density and acoustic compressional slowness, which are used to generate elastic properties and compute geomechanical curves. Two classifications of machine learning algorithms for identifying outliers and poor-quality data due to bad hole conditions are discussed: supervised and unsupervised learning. The first allows the expert to train a model from existing and categorized data, whereas unsupervised learning algorithms learn from a collection of unlabeled data. Each classification type has distinct advantages and disadvantages. Identifying outliers and conditioning well logs prior to a petrophysical analysis or machine learning model can be a time-consuming and laborious process, especially when large multi-well datasets are considered. In this study, a new supervised learning algorithm is presented that utilizes multiple linear regression analysis to repair well log data in an iterative and automated routine. This technique allows outliers to be identified and repaired while improving the efficiency of the log data editing process without compromising accuracy. The algorithm uses sophisticated logic and curve predictions derived via multiple linear regression to systematically repair various well logs. A clear improvement in efficiency is observed when the algorithm is compared to other currently used methods, including manual processing by a petrophysicist and unsupervised outlier detection methods. The algorithm can also be leveraged over multiple wells to produce more generalized predictions. Through a platform created to quickly identify and repair invalid log data, the results are controlled through input and supervision by the user. This methodology is not a direct replacement of an expert interpreter, but is complementary, allowing the petrophysicist to leverage computing power, improve consistency, reduce error, and improve turnaround time.
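A bare-bones version of an iterative multiple-linear-regression repair loop is sketched below. The published algorithm applies considerably more sophisticated logic, so this is a conceptual illustration only; the function name, the flagged-sample mask, and the refit-on-all-samples iteration scheme are invented for the example.

# Conceptual sketch of iterative log repair via multiple linear regression.
# The real algorithm's logic is more sophisticated; names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

def repair_log(target, predictors, bad_mask, n_iter=3):
    """Replace flagged samples of `target` with MLR predictions from other logs."""
    repaired = target.copy()
    train_mask = ~bad_mask  # first pass: fit only on trusted samples
    for _ in range(n_iter):
        model = LinearRegression().fit(predictors[train_mask], repaired[train_mask])
        repaired[bad_mask] = model.predict(predictors[bad_mask])
        train_mask = np.ones_like(bad_mask)  # later passes refit on all samples
    return repaired

In practice the repaired intervals would be reviewed by the user, consistent with the paper's emphasis on keeping the petrophysicist in control of the result.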

