Rock Strength Prediction in Real-Time while Drilling Employing Random Forest and Functional Network Techniques

2021 ◽  
pp. 1-21
Author(s):  
Hany Gamal ◽  
Ahmed Alsaihati ◽  
Salaheldin Elkatatny ◽  
Saleh Haidary ◽  
Abdulazeez Abdulraheem

Abstract The rock unconfined compressive strength (UCS) is one of the key parameters for geomechanical and reservoir modeling in the petroleum industry. Obtaining the UCS by conventional methods, such as experimental work or empirical correlations from logging data, is time-consuming and costly. To overcome these drawbacks, this paper uses artificial intelligence (AI) to predict the rock strength in real time from the drilling parameters using two AI tools. Random forest (RF) based on principal component analysis (PCA) and functional network (FN) techniques were employed to build two UCS prediction models based on drilling data such as weight on bit (WOB), drill string rotating speed (RS), drilling torque (T), stand-pipe pressure (SPP), mud pumping rate (Q), and rate of penetration (ROP). The models were built using 2,333 data points from Well (A) with a 70:30 training-to-testing ratio. The models were validated using an unseen data set (1,300 data points) from Well (B), which is located in the same field and was drilled across the same complex lithology. The PCA-based RF model outperformed the FN model in terms of correlation coefficient (R) and average absolute percentage error (AAPE). The overall accuracy of the PCA-based RF model was an R of 0.99 and an AAPE of 4.3%, while the FN model yielded an R of 0.97 and an AAPE of 8.5%. The validation results showed that R was 0.99 for RF and 0.96 for FN, while the AAPE was 4% and 7.9% for the RF and FN models, respectively. The developed PCA-based RF and FN models provide accurate UCS estimation in real time from the drilling data, saving time and cost and enhancing well stability by generating a UCS log from the rig drilling data.
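
The abstract reports accuracy through the correlation coefficient (R) and the average absolute percentage error (AAPE) but does not spell out the formulas. The sketch below assumes the usual definitions (Pearson correlation, and the mean of |actual - predicted| / |actual| expressed in percent); the function names and the sample values are hypothetical and are not data from the study.

```python
import math


def aape(actual, predicted):
    """Average absolute percentage error, in percent (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical UCS values in arbitrary units, for illustration only.
ucs_actual = [50.0, 80.0, 100.0, 125.0]
ucs_predicted = [55.0, 76.0, 100.0, 120.0]

print(f"AAPE = {aape(ucs_actual, ucs_predicted):.2f} %")
print(f"R    = {pearson_r(ucs_actual, ucs_predicted):.3f}")
```

AAPE is undefined when an actual value is zero, so a real workflow would need to filter or guard such points.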

2021 ◽  
pp. 1-13
Author(s):  
Hany Gamal ◽  
Ahmed Alsaihati ◽  
Salaheldin Elkatatny

Abstract Sonic data provide significant rock properties that are commonly used for designing the operational programs for drilling, rock fracturing, and development operations. The conventional methods for acquiring rock sonic data in terms of compressional and shear slowness (ΔTc and ΔTs) are costly and time-consuming. This paper proposes machine learning models for predicting the sonic logs from the drilling data in real time. Decision tree (DT) and random forest (RF) were employed as tree-based algorithms for building the sonic prediction models for drilling complex lithology rocks that include limestone, sandstone, shale, and carbonate formations. The input data for the models are the surface drilling parameters, which are used to predict the shear and compressional slowness. The study employed a data set of 2,888 data points for building and testing the models, while another set of 2,863 data points was used for further validation of the sonic models. Sensitivity investigations were performed for the DT and RF models to confirm optimal accuracy. The correlation coefficient (R) and average absolute percentage error (AAPE) were used to check the models' accuracy by comparing the actual values with the models' outputs, in addition to the sonic log profiles. The results indicated that the developed models have a high capability for sonic prediction from the drilling data: the DT model recorded R higher than 0.967 and AAPE less than 2.76% for the ΔTc and ΔTs models, while RF showed R higher than 0.991 with AAPE less than 1.07%. The further validation of the developed models confirmed these results, and the RF model outperformed the DT model: RF showed R higher than 0.986 with AAPE less than 1.12%, while the DT prediction recorded R greater than 0.93 with AAPE less than 1.95%. Sonic prediction through the developed models will save the cost and time of acquiring the sonic data through conventional methods and will provide real-time estimation from the drilling parameters.


2021 ◽  
Author(s):  
Ahmed Al-Sabaa ◽  
Hany Gamal ◽  
Salaheldin Elkatatny

Abstract The porosity of the drilled rock is an important parameter that determines the formation storage capacity. The common industrial technique for acquiring rock porosity is downhole logging. Logging while drilling or wireline porosity logging usually provides a complete porosity log for the section of interest; however, operational constraints on the logging tool might preclude the logging job, in addition to the job cost. The objective of this study is to provide an intelligent model to predict the porosity from the drilling parameters. An artificial neural network (ANN), a tool of artificial intelligence (AI), was employed in this study to build the porosity prediction model based on drilling parameters such as weight on bit (WOB), drill string rotating speed (RS), drilling torque (T), stand-pipe pressure (SPP), and mud pumping rate (Q). The novel contribution of this study is a rock porosity model for complex lithology formations that uses drilling parameters in real time. The model was built using 2,700 data points from Well (A) with a 74:26 training-to-testing ratio. Many sensitivity analyses were performed to optimize the ANN model. The model was validated using an unseen data set (1,000 data points) from Well (B), which is located in the same field and was drilled across the same complex lithology. The results showed high performance for the model in the training, testing, and validation processes. The overall accuracy of the model was determined in terms of correlation coefficient (R) and average absolute percentage error (AAPE). Overall, R was higher than 0.91 and AAPE was less than 6.1% for model building and validation. Predicting the rock porosity while drilling in real time will save the logging cost and will also provide a guide for formation storage capacity and interpretation analysis.
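
To make the 74:26 ratio concrete, the sketch below partitions 2,700 sample indices into training and testing subsets. The abstract does not say how the split was drawn, so the shuffled random split, the fixed seed, and the function name `split_indices` are assumptions made purely for illustration.

```python
import random


def split_indices(n_samples, train_fraction, seed=0):
    """Shuffle sample indices and split them into training and testing lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 2,700 data points with a 74:26 training-to-testing ratio, as in the abstract.
train_idx, test_idx = split_indices(2700, 0.74)
print(len(train_idx), len(test_idx))
```

Under these assumptions the split yields 1,998 training points and 702 testing points; the Well (B) data stay entirely outside this split and serve only for validation.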


Author(s):  
Ahmed Alsaihati ◽  
Salaheldin Elkatatny ◽  

Mechanical rock properties are often determined using sonic log data, namely compressional velocity (VP) and shear velocity (VS). However, a sonic well log is not always acquired because of deteriorated hole conditions (e.g., hole washout), sonic tool failures, especially in high-pressure, high-temperature (HPHT) wells, and relatively high cost. This paper introduces two data-driven models, an artificial neural network (ANN) and a random forest (RF), to estimate VP and VS across different formations that are characterized by deep burial depth and strong heterogeneity. Two types of actual field data were used to develop the models: (i) drilling surface parameters, which include flow rate, standpipe pressure, rotary speed, and surface torque, and (ii) acoustic velocities VP and VS, which were acquired by a conventional sonic log. Well-1 and Well-2, with a combined 6,846 data points, were used to develop the models, while Well-3, with 1,016 data points, was used to evaluate how well the developed models generalize to an unseen data set with different statistical behavior. Furthermore, Well-3 was used to compare the accuracy of the developed models with the earliest published correlations for estimating VS. The results showed that the RF outperformed the optimized ANN in estimating VP and VS in Well-3. The RF predicted VP with a low average absolute percentage error (AAPE) of 0.9% and a correlation coefficient (R) of 0.87, while the AAPE and R were 6.7% and 0.45 for the ANN. Similarly, the RF estimated VS with an AAPE of 1.1% and an R of 0.85, whereas the ANN predicted VS with an AAPE of 9.5% and an R of 0.40. Furthermore, the RF was more accurate in determining VS in Well-3 than the earliest published correlations.
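
As an illustration of how VP and VS feed into mechanical rock properties, the sketch below evaluates the standard isotropic linear-elastic relations for the dynamic Poisson's ratio, shear modulus, and Young's modulus. The abstract does not state which properties or formulas the authors had in mind, so these relations, the bulk density input, the function name, and the sample values are all assumptions.

```python
def dynamic_elastic_properties(vp, vs, density):
    """Return (Poisson's ratio, shear modulus, Young's modulus).

    vp, vs  : compressional and shear velocities in m/s
    density : bulk density in kg/m^3 (an extra input, not named in the abstract)
    Moduli are returned in Pa. Assumes an isotropic, linear-elastic medium.
    """
    if not 0 < vs < vp:
        raise ValueError("expected 0 < vs < vp")
    vp2, vs2 = vp ** 2, vs ** 2
    poisson = (vp2 - 2.0 * vs2) / (2.0 * (vp2 - vs2))
    shear_modulus = density * vs2
    youngs_modulus = shear_modulus * (3.0 * vp2 - 4.0 * vs2) / (vp2 - vs2)
    return poisson, shear_modulus, youngs_modulus


# Hypothetical values for illustration only.
nu, g, e = dynamic_elastic_properties(vp=4000.0, vs=2000.0, density=2500.0)
print(f"Poisson's ratio = {nu:.3f}")
print(f"Shear modulus   = {g / 1e9:.2f} GPa")
print(f"Young's modulus = {e / 1e9:.2f} GPa")
```

These are dynamic (small-strain) moduli; converting them to static values requires a calibration that is outside the scope of this sketch.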


2021 ◽  
Author(s):  
Temirlan Zhekenov ◽  
Artem Nechaev ◽  
Kamilla Chettykbayeva ◽  
Alexey Zinovyev ◽  
German Sardarov ◽  
...  

SUMMARY Researchers base their analysis on basic drilling parameters obtained during mud logging and demonstrate impressive results. However, because of the data-quality limitations often encountered during drilling, those solutions tend to lose their stability and predictive power. In this work, the concept of hybrid modeling is introduced, which integrates analytical correlations with machine learning algorithms to obtain stable solutions that are consistent from one data set to another.


2011 ◽  
Vol 20 (04) ◽  
pp. 753-781
Author(s):  
KAI CHEN ◽  
KIA MAKKI ◽  
NIKI PISSINOU

In metropolitan regions, most congestion or traffic jams are caused by the uneven distribution of traffic flow, which creates bottleneck points where the traffic volume exceeds the road capacity. Unexpected incidents are the next most probable cause of these bottleneck regions. Moreover, most drivers drive based on their own experience without awareness of real-time traffic situations. This unintelligent traffic behavior can make the congestion problem worse. Prediction-based route guidance systems show great improvements in solving the inefficient diversion strategy problem by estimating future travel time when calculating accurate travel time is difficult. However, the performance of machine-learning-based prediction models that rely on historical data sets degrades sharply during congestion. This paper develops a new navigation system for reducing the travel time of an individual driver and distributing the flow of urban traffic efficiently in order to reduce the occurrence of congestion. Compared with previous route guidance systems, the results reveal that our system, which applies the advanced multi-lane prediction based real-time fastest path (AMPRFP) algorithm, can significantly reduce travel time, especially when drivers travel in a complex route environment and face frequent congestion. Unlike the previous system,¹ it can be applied to either single-lane or multi-lane urban traffic networks, where the reasons for congestion are considerably more complex. We also demonstrate the advantages of this system and verify the results using real highway traffic data and a synthetic experiment.
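
The abstract does not describe how AMPRFP works internally. As a loose illustration of the general idea of routing on predicted rather than historical travel times, the sketch below runs plain Dijkstra's algorithm over a small road graph whose edge weights stand in for predicted travel times. It is not the AMPRFP algorithm, and the graph, node names, and weights are hypothetical.

```python
import heapq


def fastest_path(graph, source, target):
    """Dijkstra's algorithm over non-negative predicted travel times.

    graph maps each node to a dict of {neighbor: predicted_travel_time}.
    Returns (total_time, path); path is empty if target is unreachable.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, travel_time in graph.get(node, {}).items():
            new_cost = cost + travel_time
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical predicted travel times in minutes.
predicted = {
    "A": {"B": 4, "C": 2},
    "B": {"D": 5},
    "C": {"B": 1, "D": 8},
    "D": {},
}
total_time, route = fastest_path(predicted, "A", "D")
print(total_time, route)
```

In a real-time setting the edge weights would be refreshed from the prediction model and the search rerun as conditions change.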


2020 ◽  
Vol 28 (6) ◽  
pp. 1273-1291
Author(s):  
Nesreen El-Rayes ◽  
Ming Fang ◽  
Michael Smith ◽  
Stephen M. Taylor

Purpose The purpose of this study is to develop tree-based binary classification models to predict the likelihood of employee attrition based on firm cultural and management attributes. Design/methodology/approach A data set of resumes anonymously submitted through Glassdoor’s online portal is used in tandem with public company review information to fit decision tree, random forest and gradient boosted tree models to predict the probability of an employee leaving a firm during a job transition. Findings Random forest and decision tree methods are found to be the strongest attrition prediction models. In addition, compensation, company culture and senior management performance play a primary role in an employee’s decision to leave a firm. Practical implications This study may be used by human resources staff to better understand factors which influence employee attrition. In addition, techniques developed in this study may be applied to company-specific data sets to construct customized attrition models. Originality/value This study contains several novel contributions, including exploratory studies such as industry job transition percentages, distributional comparisons of the factors that contribute strongly to attrition between employees who left and those who stayed with the firm, and the first comprehensive search over binary classification models to identify which provides the strongest predictive performance for employee attrition.


2021 ◽  
pp. 1-14
Author(s):  
Ahmed Farid Ibrahim ◽  
Salaheldin Elkatatny ◽  
Yasmin Abdelraouf ◽  
Mustafa Al Ramadan

Abstract Water saturation (Sw) is a vital factor in hydrocarbon-in-place calculations. Sw is usually calculated using different equations; however, the calculated values are often inconsistent with experimental results because the underlying assumptions of these equations are frequently incorrect. Moreover, these approaches rely strongly on experimental analysis, which is expensive and time-consuming. This study introduces the application of different machine learning (ML) methods to predict Sw from conventional well logs. Functional networks (FN), support vector machine (SVM), and random forests (RF) were implemented to calculate Sw using the gamma-ray (GR) log, neutron porosity (NPHI) log, and resistivity (Rt) log. A data set of 782 points from two wells (Well-1 and Well-2) in a tight gas sandstone formation was used to build and then validate the different ML models. The data set from Well-1 was used for training and testing the ML models, and the unseen data from Well-2 were then used to validate the developed models. The results from the FN, SVM, and RF models showed their capability of accurately predicting Sw from conventional well logging data. The correlation coefficient (R) values between actual and estimated Sw from the FN model were 0.85 and 0.83, compared to 0.98 and 0.95 from the RF model, for the training and testing sets, respectively. The SVM model showed R values of 0.95 and 0.85 for the two data sets. The average absolute percentage error (AAPE) was less than 8% for the three ML models. The ML models outperform the empirical correlations, which have an AAPE greater than 19%. This study provides ML applications to accurately forecast water saturation using readily available conventional well logs without additional core analysis or well site interventions.
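
The abstract does not name the equations it compares against. One widely used example of such an equation is Archie's relation for clean formations, Sw = ((a · Rw) / (φ^m · Rt))^(1/n), sketched below. Treating Archie's equation as representative is an assumption, as are the default parameter values (a = 1, m = 2, n = 2) and the sample inputs; the parameter assumptions are typical of the kind that may not hold in tight gas sandstones.

```python
def archie_water_saturation(rt, rw, porosity, a=1.0, m=2.0, n=2.0):
    """Water saturation from Archie's equation (fraction, capped at 1.0).

    rt       : true formation resistivity, ohm-m
    rw       : formation water resistivity, ohm-m
    porosity : fractional porosity (0 to 1)
    a, m, n  : tortuosity factor, cementation and saturation exponents
               (defaults are common textbook values, assumed here)
    """
    if rt <= 0 or rw <= 0 or not 0 < porosity <= 1:
        raise ValueError("resistivities must be positive and porosity in (0, 1]")
    sw = ((a * rw) / (porosity ** m * rt)) ** (1.0 / n)
    return min(sw, 1.0)


# Hypothetical log readings for illustration only.
sw = archie_water_saturation(rt=20.0, rw=0.05, porosity=0.2)
print(f"Sw = {sw:.2f}")
```

The exponents m and n normally come from core measurements, which is one reason such equations depend on laboratory analysis.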


2018 ◽  
Author(s):  
Michal Kačmařík ◽  
Jan Douša ◽  
Florian Zus ◽  
Pavel Václavovic ◽  
Kyriakos Balidakis ◽  
...  

Abstract. An analysis of the impact of processing settings on estimated tropospheric gradients is presented. The study is based on the benchmark data set collected within the COST GNSS4SWEC action with observations from 430 GNSS reference stations in central Europe for May and June 2013. Tropospheric gradients were estimated in eight different variants of GNSS data processing using Precise Point Positioning with the G-Nut/Tefnut software. The impacts of the gradient mapping function, elevation cut-off angle, GNSS constellation, and real-time versus post-processing mode were assessed by comparing the variants with each other and by evaluating them against tropospheric gradients derived from two numerical weather prediction models. Generally, all the solutions in the post-processing mode provided a robust tropospheric gradient estimation with a clear relation to real weather conditions. The quality of tropospheric gradient estimates in real-time mode mainly depends on the actual quality of the real-time orbits and clocks. The best results were achieved using the 3° elevation cut-off angle and a combined GPS + GLONASS constellation. Systematic effects of up to 0.3 mm were observed in estimated tropospheric gradients when using different gradient mapping functions which depend on the applied observation elevation-dependent weighting. While the latitudinal troposphere tilting causes a systematic difference in the north gradient component on a global scale, large local wet gradients pointing in the direction of increased humidity cause systematic differences in both gradient components depending on the gradient direction.


Author(s):  
Fairoz Q. Kareem ◽  
Adnan Mohsin Abdulazeez ◽  
Dathar A. Hasan

Weather forecasting is the process of predicting the state of the atmosphere for certain regions or locations by utilizing recent technology. Thousands of years ago, people in some civilizations tried to foretell the weather by studying the stars and astronomy. Knowing the weather conditions has a direct impact on many fields, such as commerce, agriculture, and aviation. With recent developments in technology, especially in data mining (DM) and machine learning techniques, many researchers have proposed weather forecasting systems based on data mining classification techniques. In this paper, we used neural networks, Naïve Bayes, random forest, and K-nearest neighbor algorithms to build weather prediction models. These models classify unseen data instances into multiple classes: rain, fog, partly-cloudy day, clear-day, and cloudy. The model for each algorithm was trained and tested using synoptic data from the Kaggle website. The dataset contains 1,796 instances and 8 attributes. Compared with the other algorithms, the random forest algorithm achieved the best accuracy, 89%. These results indicate that data mining classification algorithms can serve as effective tools for weather forecasting.
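
Of the algorithms listed, K-nearest neighbor is the simplest to show in a few lines. The sketch below is a minimal pure-Python version using Euclidean distance and majority voting. The two features (relative humidity and visibility), the training points, and the choice of k = 3 are hypothetical; the abstract does not list the eight attributes or the settings used.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (relative humidity, visibility in km) samples.
points = [(0.90, 2.0), (0.85, 3.0), (0.95, 0.5),
          (0.30, 10.0), (0.40, 9.0), (0.35, 11.0)]
labels = ["rain", "rain", "fog", "clear-day", "clear-day", "clear-day"]

print(knn_predict(points, labels, (0.88, 2.5)))
print(knn_predict(points, labels, (0.32, 10.2)))
```

Because the features here are on different scales, visibility dominates the distance; in practice the attributes would usually be normalized first.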


10.2196/28856 ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. e28856
Author(s):  
Zahid Ullah ◽  
Farrukh Saleem ◽  
Mona Jamjoom ◽  
Bahjat Fakieh

Background The use of artificial intelligence has revolutionized every area of life such as business and trade, social and electronic media, education and learning, manufacturing industries, medicine and sciences, and every other sector. The new reforms and advanced technologies of artificial intelligence have enabled data analysts to transmute raw data generated by these sectors into meaningful insights for an effective decision-making process. Health care is one of the integral sectors where a large amount of data is generated daily, and making effective decisions based on these data is therefore a challenge. In this study, cases related to childbirth either by the traditional method of vaginal delivery or cesarean delivery were investigated. Cesarean delivery is performed to save both the mother and the fetus when complications related to vaginal birth arise. Objective The aim of this study was to develop reliable prediction models for a maternity care decision support system to predict the mode of delivery before childbirth. Methods This study was conducted in 2 parts for identifying the mode of childbirth: first, the existing data set was enriched and second, previous medical records about the mode of delivery were investigated using machine learning algorithms and by extracting meaningful insights from unseen cases. Several prediction models were trained to achieve this objective, such as decision tree, random forest, AdaBoostM1, bagging, and k-nearest neighbor, based on original and enriched data sets. Results The prediction models based on enriched data performed well in terms of accuracy, sensitivity, specificity, F-measure, and receiver operating characteristic curves in the outcomes. Specifically, the accuracy of k-nearest neighbor was 84.38%, that of bagging was 83.75%, that of random forest was 83.13%, that of decision tree was 81.25%, and that of AdaBoostM1 was 80.63%. 
Enrichment of the data set had a good impact on improving the accuracy of the prediction process, which supports maternity care practitioners in making decisions in critical cases. Conclusions Our study shows that enriching the data set improves the accuracy of the prediction process, thereby supporting maternity care practitioners in making informed decisions in critical cases. The enriched data set used in this study yields good results, but this data set can become even better if the records are increased with real clinical data.
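
The evaluation measures named in the Results (accuracy, sensitivity, specificity, and F-measure) can all be derived from a binary confusion matrix. The sketch below assumes the standard definitions and treats one delivery mode as the positive class; the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return standard binary classification metrics as a dict of fractions."""
    total = tp + fn + tn + fp
    if total == 0 or tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("counts do not allow all metrics to be computed")
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Receiver operating characteristic curves additionally require predicted scores rather than hard class labels, so they are not covered by this sketch.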

