Application of Various Machine Learning Techniques in Predicting Water Saturation in Tight Gas Sandstone Formation

2021 ◽  
pp. 1-14
Author(s):  
Ahmed Farid Ibrahim ◽  
Salaheldin Elkatatny ◽  
Yasmin Abdelraouf ◽  
Mustafa Al Ramadan

Abstract Water saturation (Sw) is a vital factor in hydrocarbon in-place calculations. Sw is usually calculated using different equations; however, the resulting values are often inconsistent with experimental results because the underlying assumptions are frequently incorrect. Moreover, these approaches rely heavily on experimental analysis, which is expensive and time-consuming. This study introduces the application of different machine learning (ML) methods to predict Sw from conventional well logs. Function networks (FN), support vector machine (SVM), and random forests (RF) were implemented to calculate Sw using the gamma-ray (GR) log, neutron porosity (NPHI) log, and resistivity (Rt) log. A dataset of 782 points from two wells (Well-1 and Well-2) in a tight gas sandstone formation was used to build and then validate the ML models. The data from Well-1 were used to train and test the ML models, and the unseen data from Well-2 were then used to validate the developed models. The FN, SVM, and RF models were all capable of accurately predicting Sw from conventional well logging data. The correlation coefficient (R) between actual and estimated Sw was 0.85 and 0.83 for the FN model, compared to 0.98 and 0.95 for the RF model, on the training and testing sets, respectively. The SVM model showed R values of 0.95 and 0.85 on the different datasets. The average absolute percentage error (AAPE) was less than 8% for all three ML models. The ML models outperformed the empirical correlations, which have an AAPE greater than 19%. This study provides ML applications that accurately forecast water saturation using readily available conventional well logs, without additional core analysis or well site interventions.
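The two scores quoted in this abstract, R and AAPE, can be illustrated with a minimal sketch. The abstract does not spell out the formulas, so the code assumes the usual definitions: the Pearson correlation coefficient for R, and the mean of |actual - predicted| / |actual| expressed in percent for AAPE. The function names and sample values are hypothetical and are not taken from the paper.

```python
import math


def aape(actual, predicted):
    """Average absolute percentage error in percent (assumed definition)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the actual value,
    # so actual values must be non-zero.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical water-saturation fractions, for illustration only.
sw_actual = [0.20, 0.40, 0.50]
sw_predicted = [0.22, 0.36, 0.50]
print(f"AAPE = {aape(sw_actual, sw_predicted):.2f}%")
print(f"R    = {pearson_r(sw_actual, sw_predicted):.3f}")
```

Because AAPE divides by the actual value, it grows quickly when the measured Sw is close to zero, which is worth keeping in mind when comparing it with R.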

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Osama Siddig ◽  
Ahmed Farid Ibrahim ◽  
Salaheldin Elkatatny

Unconventional resources have recently gained a lot of attention, and as a consequence, research interest in predicting total organic carbon (TOC) as a crucial quality indicator has increased. TOC is commonly measured experimentally; however, due to sampling restrictions, obtaining continuous TOC data is difficult. Therefore, different empirical correlations for TOC have been presented, but there are concerns about their generalization and accuracy. In this paper, different machine learning (ML) techniques were used to develop models that predict TOC from well logs, including formation resistivity (FR), spontaneous potential (SP), sonic transit time (Δt), bulk density (RHOB), neutron porosity (CNP), gamma ray (GR), and spectrum logs of thorium (Th), uranium (Ur), and potassium (K). Over 1250 data points from the Devonian Duvernay shale were used to create and validate the models. These datasets were obtained from three wells; the first was used to train the models, while the datasets from the other two wells were used to test and validate them. Support vector machine (SVM), random forest (RF), and decision tree (DT) were the ML approaches tested, and their predictions were compared with three empirical correlations. Various parameters of the AI methods were tested to ensure the best possible accuracy in terms of correlation coefficient (R) and average absolute percentage error (AAPE) between the actual and predicted TOC. The three ML methods yielded good matches; however, the RF-based model performed best. The RF model was able to predict the TOC for the different datasets with R values ranging between 0.93 and 0.99 and AAPE values below 14%. In terms of average error, the ML-based models outperformed the three empirical correlations.
This study shows the capability and robustness of ML models to predict the total organic carbon from readily available logging data without the need for core analysis or additional well interventions.


2019 ◽  
Vol 67 (6) ◽  
pp. 1991-2003 ◽  
Author(s):  
Edyta Puskarczyk

Abstract This study considers unconventional oil and gas reservoirs from the lower Palaeozoic basin on the western slope of the East European Craton. The aim was to supplement and improve standard well-log interpretation using machine learning methods, especially artificial neural networks (ANNs). ANNs were applied to standard well logging data, e.g. P-wave velocity, density, resistivity, neutron porosity, radioactivity and photoelectric factor. Information about lithology or stratigraphy was not taken into account during the calculations. We apply different classification methods: cluster analysis, support vector machine (SVM) and an artificial neural network (Kohonen algorithm). We compare the results and analyse the electrofacies obtained. For the same data set, the results of the SVM classification were compared to those of the Kohonen algorithm; the two were very similar and in very good agreement. The Kohonen algorithm was used for pattern recognition, identification of electrofacies, and geological interpretation of well log data. Applying the Kohonen algorithm identified groups corresponding to the gas-bearing intervals. The analysis showed differentiation between gas-bearing formations and the surrounding beds, as well as internal differentiation within the gas-saturated beds. It is concluded that ANNs are a useful and quick tool for preliminary classification of members and identification of gas-saturated intervals.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Catherine Cheung ◽  
Calista Biondic ◽  
Zouhair Hamaimou ◽  
Julio Valdes

Rapid developments in sensor technology, data processing tools and data storage capability have helped fuel an increased appetite for equipment health monitoring in mechanical systems. As a result, the number of sensors and amount of data collected for health monitoring has grown tremendously. It is hoped that by collecting large quantities of operational data, predictive tools can be developed that will provide operational, maintenance and safety benefits. Data mining and machine learning techniques are important tools in addressing the ensuing challenge of extracting useful results from the data collected. In this work, the sensor data from a gas turbine system was analyzed with the objective of failure modeling and prediction. Previous efforts had used a two-class approach for this problem, to distinguish healthy and failed states of the system. In this work, a third class labelled as deteriorated data is added prior to each failure event to explore the ability of machine learning models to provide early warning of upcoming incidents. Several maintenance incidents were recorded by the sensor system in two separate vehicles. Three approaches to selecting training data were used. The first followed a traditional method of randomly selecting data points from all data according to a desired percentage of failed data to include in training, target ratios between failed and healthy data in each data set, as well as target ratios between training and testing data. The second data selection strategy was to consider data related to failure incidents as a whole and select certain incidents to include in training, and the remaining ones to be unseen in testing. The third approach was cross-validation which is typically used as a technique to evaluate how a classifier will perform on unseen data while still using the entirety of the data to train the final classifier. 
In addition to investigating training and data selection strategies, the effect of hyperparameter optimization was explored, as well as the effect of varying the time period of the deteriorated class. Using the gas turbine data, which included 7 failure incidents and 76 predictor variables, a variety of classifier models of the system were developed for a three-class problem to differentiate healthy, deteriorated and failed system states. The classifier methods included support vector machines, Gaussian Naïve Bayes, random forest, AdaBoost, multilayer perceptron, k-nearest neighbor, and XGBoost. Ensemble models were also created to leverage all the individual classifier models that were developed. This paper describes the comprehensive results obtained using the various approaches and combinations, highlighting their respective benefits and limitations.
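Two ideas from this abstract can be sketched in a few lines: adding a deteriorated class in a time window before each failure, and selecting training data by whole incidents so that some incidents stay unseen for testing. This is only an illustrative sketch under stated assumptions. The function names, the `incident` field, the treatment of a failure as a single timestamp, and the window length are all hypothetical and do not come from the paper.

```python
def label_states(times, failure_times, window):
    """Assign 'healthy', 'deteriorated' or 'failed' to each timestamp.

    Assumption: a sample is 'deteriorated' if a failure occurs within
    `window` time units after it; samples at a failure time are 'failed'.
    """
    labels = []
    for t in times:
        if t in failure_times:
            labels.append("failed")
        elif any(0 < f - t <= window for f in failure_times):
            labels.append("deteriorated")
        else:
            labels.append("healthy")
    return labels


def split_by_incident(samples, held_out_incidents):
    """Keep every sample of a held-out incident out of the training set."""
    train = [s for s in samples if s["incident"] not in held_out_incidents]
    test = [s for s in samples if s["incident"] in held_out_incidents]
    return train, test


# Hypothetical timeline with one failure at t = 4 and a 2-step warning window.
print(label_states([0, 1, 2, 3, 4, 5], failure_times={4}, window=2))

# Hypothetical samples tagged with the incident they belong to.
samples = [
    {"incident": "A", "t": 0},
    {"incident": "A", "t": 1},
    {"incident": "B", "t": 0},
]
train, test = split_by_incident(samples, held_out_incidents={"B"})
print(len(train), len(test))
```

Splitting by incident avoids the situation where near-identical samples from the same failure event end up on both sides of a random split, which would make test scores look better than performance on a truly unseen incident.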


Healthcare ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 1107
Author(s):  
Tayla Anthony ◽  
Amit Kumar Mishra ◽  
Willem Stassen ◽  
Jarryd Son

This paper presents the application of machine learning for classifying time-critical conditions, namely sepsis, myocardial infarction and cardiac arrest, based on transcriptions of emergency calls from emergency services dispatch centers in South Africa. In this study we present results from the application of four multi-class classification algorithms: Support Vector Machine (SVM), Logistic Regression, Random Forest and K-Nearest Neighbor (kNN). The application of machine learning for classifying time-critical diseases may allow for earlier identification, adequate telephonic triage, and quicker response times of the appropriate cadre of emergency care personnel. The original data set consisted of 93 examples and was further expanded through data augmentation. Two feature extraction techniques were investigated, namely TF-IDF and handcrafted features. The results were further improved using hyper-parameter tuning and feature selection. Within the constraints of a limited data set, classification yielded an accuracy of up to 100% when training with 10-fold cross validation, and 95% accuracy when predicting on unseen data. The results are encouraging and show that automated diagnosis based on emergency dispatch centre transcriptions is feasible. When implemented in real time, this can have multiple uses, e.g. enabling call-takers to take the right action with the right priority.


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because machine learning can find non-intuitive regularities in high-dimensional datasets, it can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods, including multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best among these methods, with R2=0.84, MSE=0.55 for the training set and R2=0.83, MSE=0.56 for the test set. Several validation methods, such as the Y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4655
Author(s):  
Dariusz Czerwinski ◽  
Jakub Gęca ◽  
Krzysztof Kolano

In this article, the authors propose two models for BLDC motor winding temperature estimation using machine learning methods. For the purposes of the research, measurements were made for over 160 h of motor operation, and then, they were preprocessed. The algorithms of linear regression, ElasticNet, stochastic gradient descent regressor, support vector machines, decision trees, and AdaBoost were used for predictive modeling. The ability of the models to generalize was achieved by hyperparameter tuning with the use of cross-validation. The conducted research led to promising results of the winding temperature estimation accuracy. In the case of sensorless temperature prediction (model 1), the mean absolute percentage error MAPE was below 4.5% and the coefficient of determination R2 was above 0.909. In addition, the extension of the model with the temperature measurement on the casing (model 2) allowed reducing the error value to about 1% and increasing R2 to 0.990. The results obtained for the first proposed model show that the overheating protection of the motor can be ensured without direct temperature measurement. In addition, the introduction of a simple casing temperature measurement system allows for an estimation with accuracy suitable for compensating the motor output torque changes related to temperature.
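Hyperparameter tuning with cross-validation, as mentioned in this abstract, can be sketched as follows. The example is a hedged illustration and not the authors' pipeline: a one-dimensional k-nearest-neighbour regressor stands in for the models listed in the abstract, the folds are contiguous and unshuffled, and all function names and data values are hypothetical.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for contiguous folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


def knn_predict(x_train, y_train, x, k):
    """Average the targets of the k training points closest to x."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))
    nearest = order[:k]
    return sum(y_train[i] for i in nearest) / len(nearest)


def cv_mape(xs, ys, k, n_folds):
    """Cross-validated mean absolute percentage error, in percent."""
    errors = []
    for train, validation in k_fold_indices(len(xs), n_folds):
        x_train = [xs[i] for i in train]
        y_train = [ys[i] for i in train]
        for i in validation:
            prediction = knn_predict(x_train, y_train, xs[i], k)
            errors.append(abs((ys[i] - prediction) / ys[i]))
    return 100.0 * sum(errors) / len(errors)


def select_k(xs, ys, candidates, n_folds):
    """Pick the candidate k with the lowest cross-validated MAPE."""
    return min(candidates, key=lambda k: cv_mape(xs, ys, k, n_folds))


# Hypothetical casing temperature (input) and winding temperature (target).
casing = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
winding = [41.0, 47.5, 53.0, 60.0, 66.5, 72.0, 79.0, 85.5]
best_k = select_k(casing, winding, candidates=[1, 2, 3], n_folds=4)
print("selected k:", best_k)
```

Scoring each candidate only on folds that were held out during fitting is what lets the selected setting reflect generalization rather than fit to the training data.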


2018 ◽  
Vol 34 (3) ◽  
pp. 569-581 ◽  
Author(s):  
Sujata Rani ◽  
Parteek Kumar

Abstract This article presents an innovative approach to sentiment analysis (SA). The proposed system handles Romanized or abbreviated text and spelling variations when performing sentiment analysis. A training data set of 3,000 movie reviews and tweets was manually labeled by native speakers of Hindi in three classes, i.e. positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques, i.e. Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system was tested on 100 movie reviews and tweets. SVM performed best among the classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results of the proposed system are very promising and can be used in emerging applications like SA of product reviews and social media analysis. Additionally, the proposed system can be applied for other cultural or social benefits, such as predicting and countering riots.


Energies ◽  
2018 ◽  
Vol 11 (9) ◽  
pp. 2328 ◽  
Author(s):  
Md Shafiullah ◽  
M. Abido ◽  
Taher Abdel-Fattah

Precise fault location information plays a vital role in expediting restoration after a power distribution grid is subjected to any kind of fault. This paper proposes a Stockwell transform (ST) based optimized machine learning approach to locate faults and identify the faulty sections in distribution grids. The research employed the ST to extract useful features from the recorded three-phase current signals and fed them as inputs to different machine learning tools (MLT), including multilayer perceptron neural networks (MLP-NN), support vector machines (SVM), and extreme learning machines (ELM). The proposed approach employed the constriction-factor particle swarm optimization (CF-PSO) technique to optimize the parameters of the SVM and ELM for better generalization performance. To confirm the effectiveness of the developed fault location scheme, the results on the test datasets were compared in terms of selected statistical performance indices, including the root mean squared error (RMSE), mean absolute percentage error (MAPE), percent bias (PBIAS), RMSE-observations to standard deviation ratio (RSR), coefficient of determination (R2), Willmott’s index of agreement (WIA), and Nash–Sutcliffe model efficiency coefficient (NSEC). The satisfactory values of the statistical performance indices indicated the superiority of the optimized machine learning tools over the non-optimized tools in locating faults. In addition, this research confirmed the efficacy of the faulty section identification scheme based on overall accuracy. Furthermore, the presented results validated the robustness of the developed approach against measurement noise and uncertainties associated with pre-fault loading condition, fault resistance, and inception angle.
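Several of the performance indices named in this abstract can be computed as in the sketch below. The abstract does not give the formulas, so the code assumes commonly used definitions; in particular, the sign convention for PBIAS (observed minus predicted) differs between authors. R2 is left out because its definition also varies (squared correlation versus 1 - SSE/SST). The function name and the sample values are hypothetical.

```python
import math


def performance_indices(observed, predicted):
    """Return RMSE, MAPE, PBIAS, RSR, NSEC and WIA under assumed definitions.

    Observed values must be non-zero (MAPE) and not all identical
    (RSR, NSEC, WIA); otherwise a division by zero occurs.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)  # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)  # spread of observations
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return {
        "RMSE": math.sqrt(sse / n),
        "MAPE": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
        "PBIAS": 100.0 * sum(residuals) / sum(observed),
        "RSR": math.sqrt(sse) / math.sqrt(sst),
        "NSEC": 1.0 - sse / sst,
        "WIA": 1.0 - sse / potential,
    }


# Hypothetical fault distances in km, for illustration only.
true_location = [1.0, 2.5, 4.0, 5.5]
estimated_location = [1.1, 2.4, 4.2, 5.3]
for name, value in performance_indices(true_location, estimated_location).items():
    print(f"{name}: {value:.4f}")
```

Under these definitions a perfect prediction gives RMSE, MAPE, PBIAS and RSR of 0 and NSEC and WIA of 1, while a model that always predicts the observed mean gives NSEC = 0 and RSR = 1.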

