Development of Machine Learning Based Propped Fracture Conductivity Correlations in Shale Formations

2021
Author(s):
Mahmoud Desouky
Zeeshan Tariq
Murtada Al jawad
Hamed Alhoori
Mohamed Mahmoud
...  

Abstract Propped hydraulic fracturing is a stimulation technique used in tight formations to create conductive fractures. To predict fractured well productivity, the conductivity of those propped fractures should be estimated. It is common to measure the conductivity of propped fractures in the laboratory under controlled conditions. Laboratory measurement, however, is costly and time-consuming, which has encouraged the development of many empirical and analytical propped fracture conductivity models. Previous empirical models, though, were based on limited datasets and produced questionable correlations. We propose herein new empirical models based on an extensive data set, utilizing machine learning (ML) methods. In this study, an artificial neural network (ANN) was utilized. A dataset comprising 351 data points from propped hydraulic fracture experiments on different shale types, with different mineralogy, under various confining stresses was collected and studied. Several statistical and data science approaches, such as box and whisker plots, correlation crossplots, and the Z-score technique, were used to remove outliers and extreme data points. The performance of the developed model was evaluated using metrics such as the correlation coefficient and root mean squared error. After several executions and function evaluations, an ANN was found to be the best technique to predict propped fracture conductivity for different mineralogies. The proposed ANN models resulted in less than 7% error between actual and predicted values. In addition to the development of an optimized ANN model, explicit empirical correlations are extracted from the weights and biases of the fine-tuned model. The proposed propped fracture conductivity model was then compared with the commonly available correlations. The results revealed that the proposed mineralogy-based propped fracture conductivity models made predictions with a high correlation coefficient of 94%. This work clearly shows the potential of computer-based ML techniques in the determination of mineralogy-based propped fracture conductivity. The proposed empirical correlation can be implemented without requiring any ML-based software.
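As a minimal sketch of the workflow this abstract describes, the snippet below applies Z-score outlier removal and then converts a small trained one-hidden-layer network into an explicit correlation from its weights and biases. The feature names, synthetic data, and network size are illustrative assumptions, not the authors' actual setup.

```python
# Hedged sketch: Z-score outlier removal, then an explicit correlation
# extracted from a one-hidden-layer tanh network's weights and biases.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Hypothetical inputs: confining stress, proppant concentration, quartz fraction.
X = rng.uniform([1000, 0.5, 0.2], [8000, 4.0, 0.8], size=(351, 3))
y = 50 + 0.01 * X[:, 0] + 30 * X[:, 1] + 100 * X[:, 2] + rng.normal(0, 5, 351)

# Z-score outlier removal: keep rows whose standardized features lie within 3 sigma.
z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
mask = (z < 3).all(axis=1)
X, y = X[mask], y[mask]

# A single tanh hidden layer keeps the fitted model easy to write out explicitly.
net = MLPRegressor(hidden_layer_sizes=(5,), activation="tanh",
                   max_iter=5000, random_state=0).fit(X, y)

# Explicit correlation from the trained weights and biases:
#   y_hat = W2^T . tanh(W1^T x + b1) + b2
W1, W2 = net.coefs_
b1, b2 = net.intercepts_
x = X[0]
y_explicit = W2.T @ np.tanh(W1.T @ x + b1) + b2
assert np.allclose(y_explicit, net.predict(x.reshape(1, -1)))
```

Written out this way, the correlation can be evaluated in a spreadsheet or hand calculation, which is what allows the proposed model to be used without ML-based software.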

2014
Vol 7 (4)
pp. 132-143
Author(s):
ABBAS M. ABD
SAAD SH. SAMMEN

The prediction of hydrological phenomena (or systems) plays an increasing role in the management of water resources. Engineers are required to predict the components of natural reservoir inflow for numerous purposes, and the appropriate prediction technique varies with the intended purpose, the system characteristics, and the documented data. Because most hydrological parameters are subject to uncertainty, identifying the best prediction method is of great interest to experts. An Artificial Neural Network (ANN) approach has been adopted in this paper to predict Hemren reservoir inflow. The available data, including the monthly discharge supplied from the DerbendiKhan reservoir and the rainfall intensity over the intermediate catchment area between the Hemren and DerbendiKhan dams, were used. The Levenberg-Marquardt backpropagation (LMBP) algorithm was utilized to construct the ANN models. For the developed ANN model, networks with different numbers of neurons and layers were evaluated. A total of 24 years of historical data, covering the interval from 1980 to 2004, were used to train and test the networks. The optimum network, with 3 inputs, 40 neurons in each of two hidden layers, and one output, was selected. The Mean Squared Error (MSE) and the Correlation Coefficient (CC) were employed to evaluate the accuracy of the proposed model. Trained with an early-stopping approach, the network converged at MSE = 0.027 on the training data and forecast the testing data set with an accuracy of MSE = 0.031. The training and testing processes showed correlation coefficients of 0.97 and 0.77, respectively, indicating the high precision of this prediction technique.
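A minimal sketch of the stated topology (3 inputs, two hidden layers of 40 neurons, one output, early stopping) is given below. scikit-learn does not implement Levenberg-Marquardt, so Adam stands in for LMBP here, and synthetic placeholders replace the discharge and rainfall records.

```python
# Sketch of the paper's network shape under stated assumptions; the data
# are random placeholders, not the Hemren-DerbendiKhan records.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
# Hypothetical monthly records: DerbendiKhan discharge, rainfall intensity, month index.
X = rng.random((288, 3))            # 24 years x 12 months
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 0.05, 288)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
model = MLPRegressor(hidden_layer_sizes=(40, 40), early_stopping=True,
                     validation_fraction=0.15, max_iter=2000, random_state=1)
model.fit(X_train, y_train)

# Evaluate with the same metrics the paper reports: MSE and CC.
pred = model.predict(X_test)
mse = np.mean((pred - y_test) ** 2)
cc = np.corrcoef(pred, y_test)[0, 1]
print(f"test MSE={mse:.3f}, CC={cc:.2f}")
```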


Author(s):  
Ritu Khandelwal
Hemlata Goyal
Rajveer Singh Shekhawat

Introduction: Machine learning is an intelligent technology that works as a bridge between business and data science. With the involvement of data science, the business goal focuses on findings that yield valuable insights from available data. A large part of Indian cinema is Bollywood, a multi-million dollar industry. This paper attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average, or Flop, applying machine learning techniques for classification and prediction. To build a classifier or prediction model, the first step is the learning stage, in which a training data set is used to train the model with some technique or algorithm; the rules generated in this stage form the model used to predict future trends in different types of organizations. Methods: Classification and prediction techniques including Support Vector Machine (SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, AdaBoost, and KNN are applied to find the most efficient and effective results. All of these functionalities can be applied through GUI-based workflows organized into categories such as Data, Visualize, Model, and Evaluate. Result: The trained models generate rules from the training data set which form the prediction models for forecasting future trends in different types of organizations. Conclusion: This paper focuses on a comparative analysis performed with different parameters, such as accuracy and the confusion matrix, to identify the best possible model for predicting a movie's success. Using advertisement propaganda, production houses can plan the best time to release a movie according to its predicted success rate and gain higher benefits. Discussion: Data mining is the process of discovering patterns in large data sets, and from these patterns various relationships are discovered that help solve business problems and predict forthcoming trends. Such predictions can help production houses with advertisement propaganda and cost planning, and by accounting for these factors they can make a movie more profitable.
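A sketch of such a comparative analysis follows, scoring the listed classifiers on accuracy and confusion matrices. The synthetic data are placeholders; real movie features (budget, cast, genre, and so on) would replace them.

```python
# Illustrative comparison of the classifiers named above on placeholder data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Five classes standing in for Blockbuster/Superhit/Hit/Average/Flop.
X, y = make_classification(n_samples=600, n_features=12, n_informative=8,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "SVM": SVC(), "RandomForest": RandomForestClassifier(),
    "DecisionTree": DecisionTreeClassifier(), "NaiveBayes": GaussianNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(), "KNN": KNeighborsClassifier(),
}
for name, clf in models.items():
    y_pred = clf.fit(X_tr, y_tr).predict(X_te)
    print(name, accuracy_score(y_te, y_pred))
    print(confusion_matrix(y_te, y_pred))
```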


2021
Vol 21 (1)
Author(s):
Ann-Marie Mallon
Dieter A. Häring
Frank Dahlke
Piet Aarden
Soroosh Afyouni
...  

Abstract Background Novartis and the University of Oxford’s Big Data Institute (BDI) have established a research alliance with the aim of improving health care and drug development by making them more efficient and targeted. Using a combination of the latest statistical machine learning technology and an innovative IT platform developed to manage large volumes of anonymised data from numerous data sources and types, we plan to identify novel, clinically relevant patterns which cannot be detected by humans alone, in order to identify phenotypes and early predictors of patient disease activity and progression. Method The collaboration focuses on highly complex autoimmune diseases and is developing a computational framework to assemble a research-ready dataset across numerous modalities. For the Multiple Sclerosis (MS) project, the collaboration has anonymised and integrated phase II to phase IV clinical and imaging trial data from ≈35,000 patients across all clinical phenotypes, collected in more than 2200 centres worldwide. For the “IL-17” project, the collaboration has anonymised and integrated clinical and imaging data from over 30 phase II and III Cosentyx clinical trials, including more than 15,000 patients suffering from four autoimmune disorders (psoriasis, axial spondyloarthritis, psoriatic arthritis (PsA), and rheumatoid arthritis (RA)). Results A fundamental component of successful data analysis, and of the collaborative development of novel machine learning methods on these rich data sets, has been the construction of a research informatics framework that captures the data at regular intervals, anonymises images and integrates them with the de-identified clinical data, applies quality control, and compiles everything into a research-ready relational database available to multi-disciplinary analysts. The collaborative development by a group of software developers, data wranglers, statisticians, clinicians, and domain scientists across both organisations has been key. This framework is innovative in that it facilitates collaborative data management and makes a complicated clinical trial data set from a pharmaceutical company available to academic researchers who become associated with the project. Conclusions An informatics framework has been developed to capture clinical trial data into a pipeline of anonymisation, quality control, data exploration, and subsequent integration into a database. Establishing this framework has been integral to the development of analytical tools.
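As a loose illustration of the pipeline shape described (de-identification, quality control, loading into a relational database), the sketch below uses SQLite and hash-based pseudonymisation. All field names and QC rules are assumptions; the actual framework is far more elaborate.

```python
# Highly simplified pipeline sketch: de-identify, quality-control, load.
import hashlib
import sqlite3

raw_records = [
    {"patient_id": "NV-001", "age": 42, "score": 3.5},
    {"patient_id": "NV-002", "age": -1, "score": 2.0},   # fails QC below
]

def pseudonymise(rec):
    # Replace the direct identifier with a one-way hash (illustrative only;
    # real anonymisation involves far more than hashing one field).
    rec = dict(rec)
    rec["patient_id"] = hashlib.sha256(rec["patient_id"].encode()).hexdigest()[:12]
    return rec

def passes_qc(rec):
    # Example quality-control rule: ages must be plausible.
    return 0 < rec["age"] < 120

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clinical (pid TEXT, age INTEGER, score REAL)")
for rec in map(pseudonymise, raw_records):
    if passes_qc(rec):
        conn.execute("INSERT INTO clinical VALUES (?, ?, ?)",
                     (rec["patient_id"], rec["age"], rec["score"]))
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM clinical").fetchone())  # (1,)
```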


2021
Author(s):  
Hangsik Shin

BACKGROUND Arterial stiffness due to vascular aging is a major indicator in evaluating cardiovascular risk. OBJECTIVE In this study, we propose a method of estimating age by applying machine learning to the photoplethysmogram (PPG) for non-invasive vascular age assessment. METHODS The machine learning based age estimation model, which consists of three convolutional layers and two fully connected layers, was developed using pulse-segmented photoplethysmograms from a total of 752 adults aged 19–87 years. The performance of the developed model was quantitatively evaluated using the mean absolute error, root mean squared error, Pearson’s correlation coefficient, and coefficient of determination. Grad-CAM was used to explain the contribution of photoplethysmogram waveform characteristics to vascular age estimation. RESULTS Through 10-fold cross validation, the model achieved a mean absolute error of 8.03, a root mean squared error of 9.96, a correlation coefficient of 0.62, and a coefficient of determination of 0.38. Grad-CAM, used to determine how strongly each part of the input signal contributes to the result, confirmed that the contribution of the photoplethysmogram segment to age estimation was highest around the systolic peak. CONCLUSIONS The machine learning based vascular aging analysis method using the PPG waveform showed comparable or superior performance to previous studies, without requiring complex feature detection. CLINICALTRIAL 2015-0104
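A sketch of a model matching the stated architecture, three 1-D convolutional layers followed by two fully connected layers, is given below in PyTorch. Kernel sizes, channel counts, and the 200-sample segment length are illustrative assumptions, not the study's actual configuration.

```python
# Sketch of a 3-conv + 2-FC regression network for single-pulse PPG segments.
import torch
import torch.nn as nn

class PPGAgeNet(nn.Module):
    def __init__(self, segment_len=200):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * (segment_len // 8), 64), nn.ReLU(),
            nn.Linear(64, 1),  # predicted vascular age
        )

    def forward(self, x):          # x: (batch, 1, segment_len)
        return self.regressor(self.features(x))

model = PPGAgeNet()
dummy = torch.randn(4, 1, 200)     # four placeholder pulse segments
print(model(dummy).shape)          # torch.Size([4, 1])
```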


2021
Vol 20
pp. 415-430
Author(s):
Juthaphorn Sinsomboonthong
Saichon Sinsomboonthong

A proposed estimator, namely the weighted maximum likelihood (WML) correlation coefficient, for measuring the relationship between two variables in the presence of missing values and outliers in the dataset is presented. The estimator is derived by applying the conditional probability function to handle missing values and to give greater weight to values near the center, while outliers in the dataset are assigned only slight weight. These techniques make the proposed method robust when the preliminary assumptions of the data analysis are not met. To inspect the quality of the proposed estimator, six methods (WML, Pearson, median, percentage bend, biweight mid, and composite correlation coefficients) are compared on two criteria, namely bias and mean squared error, via a simulation study. The results on generated data illustrate that the WML estimator has the best performance in withstanding missing values and outliers in the dataset, especially for tiny sample sizes and large percentages of outliers, regardless of missing data levels. For massive sample sizes, however, the median correlation coefficient appears to be a good estimator when the linear relationship between the two variables is approximately over 0.4, irrespective of outlier and missing data levels.
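The snippet below is not the authors' WML estimator, but a simple weighted-correlation sketch of the idea the abstract describes: observations far from the centre receive small weights, so outliers barely influence the estimate. The weighting rule is an illustrative assumption.

```python
# Weighted Pearson correlation with an illustrative downweighting rule.
import numpy as np

def weighted_corr(x, y, w):
    # Weighted Pearson correlation with nonnegative weights w.
    mx, my = np.average(x, weights=w), np.average(y, weights=w)
    cov = np.average((x - mx) * (y - my), weights=w)
    return cov / np.sqrt(np.average((x - mx) ** 2, weights=w) *
                         np.average((y - my) ** 2, weights=w))

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 0.8 * x + rng.normal(scale=0.6, size=200)
x[:10] += 8; y[:10] -= 8                     # inject 5% outliers

# Downweight points with large robust z-scores (one possible weighting rule).
z = np.hypot((x - np.median(x)) / np.std(x), (y - np.median(y)) / np.std(y))
w = 1.0 / (1.0 + z ** 2)

print("plain Pearson :", np.corrcoef(x, y)[0, 1])   # pulled down by outliers
print("weighted      :", weighted_corr(x, y, w))    # closer to the true 0.8
```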


Author(s):  
Yang-Hui He

Calabi-Yau spaces, or Kähler spaces admitting zero Ricci curvature, have played a pivotal role in theoretical physics and pure mathematics for the last half century. In physics, they constituted the first and natural solution to the compactification of superstring theory to our 4-dimensional universe, primarily because one of their equivalent definitions is the admittance of covariantly constant spinors. Since the mid-1980s, physicists and mathematicians have joined forces in creating explicit examples of Calabi-Yau spaces, compiling databases of formidable size, including the complete intersection (CICY) data set, the weighted hypersurfaces data set, the elliptic-fibration data set, the Kreuzer-Skarke toric hypersurface data set, generalized CICYs, etc., totaling at least on the order of 10^10 manifolds. These all contribute to the vast string landscape, the multitude of possible vacuum solutions to string compactification. More recently, this collaboration has been enriched by computer science and data science, the former in benchmarking the complexity of the algorithms for computing geometric quantities, and the latter in applying techniques such as machine learning to extract unexpected information. These endeavours, inspired by the physics of the string landscape, have rendered the investigation of Calabi-Yau spaces one of the most exciting and interdisciplinary fields.


2021
Vol 11 (24)
pp. 11710
Author(s):
Matteo Miani
Matteo Dunnhofer
Fabio Rondinella
Evangelos Manthos
Jan Valentin
...  

This study introduces a machine learning approach based on Artificial Neural Networks (ANNs) for the prediction of Marshall test results, stiffness modulus, and air voids data of different bituminous mixtures for road pavements. A novel approach for an objective and semi-automatic identification of the optimal ANN structure, defined by the so-called hyperparameters, is introduced and discussed. Mechanical and volumetric data were obtained by conducting laboratory tests on 320 Marshall specimens, and the results were used to train the neural network. The k-fold cross validation method was used to partition the available data set and obtain an unbiased evaluation of the model's predictive error. The ANN hyperparameters were optimized using Bayesian optimization, which efficiently replaced the more costly trial-and-error procedure and automated the hyperparameter tuning. The proposed ANN model is characterized by a Pearson coefficient value of 0.868.
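A hedged sketch of this tuning loop is shown below, with Optuna's default TPE sampler standing in for the study's Bayesian optimization and 5-fold cross validation estimating predictive error. The search space and synthetic data are illustrative assumptions.

```python
# Bayesian-style hyperparameter tuning of a small ANN with k-fold CV.
import numpy as np
import optuna
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.random((320, 6))                           # placeholder mix-design features
y = X @ rng.random(6) + rng.normal(0, 0.05, 320)   # placeholder mechanical target

def objective(trial):
    model = MLPRegressor(
        hidden_layer_sizes=(trial.suggest_int("units", 8, 64),),
        alpha=trial.suggest_float("alpha", 1e-5, 1e-1, log=True),
        learning_rate_init=trial.suggest_float("lr", 1e-4, 1e-1, log=True),
        max_iter=2000, random_state=3)
    # 5-fold CV gives an unbiased estimate of predictive performance.
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

study = optuna.create_study(direction="maximize")  # TPE sampler by default
study.optimize(objective, n_trials=25)
print(study.best_params, study.best_value)
```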


2021
Vol 2070 (1)
pp. 012145
Author(s):
R Shiva Shankar
CH Raminaidu
VV Sivarama Raju
J Rajanikanth

Abstract Epilepsy is a chronic neurological illness that affects around 50 million people worldwide. It is estimated that if epilepsy were correctly diagnosed and treated, up to 70% of people with the condition would be seizure-free. There is therefore a need to detect epilepsy at its initial stages so that symptoms can be reduced through medication and other strategies. We used the Epileptic Seizure Recognition dataset provided by the UCI Machine Learning Repository to train the model; this dataset contains 179 attributes and 11,500 unique values. MLP, PCA with RF, QDA, LDA, and PCA with ANN were applied; among them, PCA with ANN provided the best metrics: 97.55% accuracy, 94.24% precision, 91.48% recall, 83.38% hinge loss, and 2.32% mean squared error.
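A sketch of the best-performing combination, PCA feeding an ANN, follows as a scikit-learn pipeline. The component count, network size, and the assumption of 178 feature columns plus a label (matching the dataset's 179 attributes) are illustrative, not the paper's configuration.

```python
# PCA -> ANN classification pipeline on placeholder EEG-like data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 178))          # placeholder EEG feature rows
y = rng.integers(0, 2, size=1000)         # 1 = seizure, 0 = non-seizure

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=4)
clf = make_pipeline(StandardScaler(),
                    PCA(n_components=30),          # dimensionality reduction
                    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                                  random_state=4))  # the "ANN" stage
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```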


2021
Vol 2021
pp. 1-13
Author(s):
Andrei Bratu
Gabriela Czibula

Data augmentation is a commonly used technique in data science for improving the robustness and performance of machine learning models. The purpose of this paper is to study the feasibility of generating synthetic data points of a temporal nature toward this end. A general approach named DAuGAN (Data Augmentation using Generative Adversarial Networks) is presented for identifying poorly represented sections of a time series, synthesizing and integrating new data points, and measuring the resulting performance improvement on a benchmark machine learning model. The problem is studied and applied in the domain of algorithmic trading, whose constraints are presented and taken into consideration. The experimental results highlight a performance improvement for a benchmark reinforcement learning agent trained on a DAuGAN-enhanced dataset to trade a financial instrument.
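A minimal GAN sketch in the spirit of DAuGAN is shown below: a generator learns to emit short synthetic time-series windows while a discriminator learns to distinguish them from real ones. The window length, network sizes, and sine-wave "real" data are placeholders, not the paper's setup.

```python
# Toy GAN generating short time-series windows for augmentation.
import torch
import torch.nn as nn

WINDOW, LATENT = 24, 8

G = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, WINDOW))
D = nn.Sequential(nn.Linear(WINDOW, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def real_batch(n=32):
    # Placeholder "real" series: noisy sine segments with random phase.
    t = torch.linspace(0, 3.14, WINDOW)
    phase = torch.rand(n, 1) * 6.28
    return torch.sin(t + phase) + 0.05 * torch.randn(n, WINDOW)

for step in range(500):
    real = real_batch()
    fake = G(torch.randn(32, LATENT))
    # Discriminator step: push real toward 1, fake toward 0.
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: try to fool the discriminator.
    loss_g = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

synthetic = G(torch.randn(5, LATENT)).detach()   # augmentation candidates
print(synthetic.shape)                           # torch.Size([5, 24])
```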


Author(s):  
Jonathan M. Gumley
Hayden Marcollo
Stuart Wales
Andrew E. Potts
Christopher J. Carra

Abstract There is growing importance in the offshore floating production sector to develop reliable and robust means of continuously monitoring the integrity of mooring systems for FPSOs and FPUs, particularly in light of the upcoming introduction of API-RP-2MIM. Here, the limitations of the current range of monitoring techniques, including well-established technologies such as load cells, sonar, and visual inspection, are discussed within the context of the growing mainstream acceptance of data science and machine learning. Given the large fleet of floating production platforms currently in service, there is a need for a readily deployable solution that can be retrofitted to existing platforms to passively monitor the performance of floating assets on their moorings, for which machine learning based systems have particular advantages. An earlier investigation, conducted in 2016 on a shallow-water, single-point-moored FPSO, employed host facility data from in-service field measurements before and after a single mooring line failure event. This paper presents how the same machine learning techniques were applied to a deep-water, semi-taut, spread-moored system for which no host facility data were available, requiring a calibrated hydrodynamic numerical model to be used as the basis for the training data set. The machine learning techniques applied to both real and synthetically generated data were successful in replicating the response of the original system, even with the latter subjected to different variations of artificial noise. Furthermore, utilizing a probability-based approach, it was demonstrated that replicating the response of the underlying system is a powerful technique for predicting changes in the mooring system.
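A hedged sketch of this idea follows: a classifier trained on synthetically generated platform responses (standing in for the calibrated hydrodynamic model) with artificial noise added, whose averaged class probabilities flag a possible line failure. All signals and features are illustrative placeholders.

```python
# Probability-based change detection trained on synthetic response data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)

def simulate_response(failed, n=500):
    # Placeholder "numerical model": mean offset of the platform response
    # shifts slightly when one mooring line is lost.
    base = rng.normal(0.4 if failed else 0.0, 1.0, size=(n, 4))
    return base + rng.normal(0, 0.2, size=(n, 4))   # artificial noise

X = np.vstack([simulate_response(False), simulate_response(True)])
y = np.array([0] * 500 + [1] * 500)                 # 1 = line failure

clf = RandomForestClassifier(random_state=5).fit(X, y)

# Probability-based monitoring: average the failure probability over a window
# of recent measurements rather than reacting to single noisy samples.
window = simulate_response(True, n=50)
p_fail = clf.predict_proba(window)[:, 1].mean()
print(f"estimated failure probability: {p_fail:.2f}")
```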

