Successful Application of Machine Learning to Improve Dynamic Modeling and History Matching for Complex Gas-Condensate Reservoirs in Hai Thach Field, Nam Con Son Basin, Offshore Vietnam

2021 ◽  
Author(s):  
Son K. Hoang ◽  
Tung V. Tran ◽  
Tan N. Nguyen ◽  
Tu A. Truong ◽  
Duy H. Pham ◽  
...  

Abstract This study aims to apply machine learning (ML) to make the history matching (HM) process easier, faster, more accurate, and more reliable by determining whether Local Grid Refinement (LGR) with a transmissibility multiplier is needed to history match gas-condensate wells producing from geologically complex reservoirs, and by determining how LGR should be set up to successfully history match those production wells. The main challenges in HM of gas-condensate production from Hai Thach wells are the large effect of condensate banking (condensate blockage), flow baffling by the sub-seismic fault network, complex reservoir distribution and connectivity, highly uncertain HIIP, and the lack of PVT information for most reservoirs. In this study, ML was applied to analyze production data using synthetic samples generated by a very large number of compositional sector models, so that the need for LGR could be identified before the HM process and the required LGR setup could also be determined. The proposed method helped provide better models in a much shorter time and improved the efficiency and reliability of the dynamic modeling process. More than 500 synthetic samples were generated using compositional sector models and divided into training and test sets. Supervised classification algorithms including logistic regression, Gaussian, Bernoulli, and multinomial Naïve Bayes, linear discriminant analysis, support vector machine, K-nearest neighbors, and Decision Tree, as well as an artificial neural network (ANN), were applied to the data sets to determine the need for LGR in HM. The best algorithm was found to be the Decision Tree classifier, with 100% and 99% accuracy on the training and test sets, respectively. The size of the LGR area could also be determined reasonably well, at 89% and 87% accuracy on the training and test sets, respectively, and the range of the transmissibility multiplier at 97% and 91% accuracy on the training and test sets, respectively. Moreover, the ML model was validated using actual production and HM data. A new method of applying ML to dynamic modeling and HM of challenging gas-condensate wells in geologically complex reservoirs has been successfully applied to the high-pressure high-temperature Hai Thach field offshore Vietnam. The proposed method helped reduce many trial-and-error simulation runs and provided better and more reliable dynamic models.
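As an illustration of the classifier comparison described in this abstract, the sketch below trains the listed algorithms on placeholder synthetic samples and reports train/test accuracy for the binary "is LGR needed?" decision. The feature construction, labels, and dataset size are assumptions for illustration only (the multinomial Naïve Bayes variant is omitted because it requires non-negative inputs); this is not the authors' actual workflow.

```python
# Minimal sketch (assumed data): compare classifiers for the binary "LGR needed?" decision.
# X stands in for features derived from synthetic production profiles of compositional
# sector models; y is 1 if the sector model required LGR. Both are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB, BernoulliNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))                    # ~500 synthetic samples, 12 placeholder features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)     # placeholder "LGR needed" label

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "Gaussian NB": GaussianNB(),
    "Bernoulli NB": BernoulliNB(),
    "LDA": LinearDiscriminantAnalysis(),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "ANN (MLP)": MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name:20s} train={accuracy_score(y_train, clf.predict(X_train)):.2f} "
          f"test={accuracy_score(y_test, clf.predict(X_test)):.2f}")
```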

2021 ◽  
Author(s):  
Son Hoang ◽  
Tung Tran ◽  
Tan Nguyen ◽  
Tu Truong ◽  
Duy Pham ◽  
...  

Abstract This paper reports a successful case study of applying machine learning to improve the history matching process, making it easier, less time-consuming, and more accurate, by determining whether Local Grid Refinement (LGR) with a transmissibility multiplier is needed to history match gas-condensate wells producing from geologically complex reservoirs, as well as determining the required LGR setup to history match those gas-condensate producers. History matching Hai Thach gas-condensate production wells is extremely challenging due to the combined effects of condensate banking, the sub-seismic fault network, complex reservoir distribution and connectivity, uncertain HIIP, and a lack of PVT data for most reservoirs. In fact, for some wells, many trial simulation runs were conducted before it became clear that LGR with a transmissibility multiplier was required to obtain a good history match. To minimize this time-consuming trial-and-error process, machine learning was applied in this study to analyze production data using synthetic samples generated by a very large number of compositional sector models, so that the need for LGR could be identified before the history matching process begins. Furthermore, the machine learning application could also determine the required LGR setup. The method helped provide better models in a much shorter time and greatly improved the efficiency and reliability of the dynamic modeling process. More than 500 synthetic samples were generated using compositional sector models and divided into separate training and test sets. Multiple classification algorithms, such as logistic regression, Gaussian Naive Bayes, Bernoulli Naive Bayes, multinomial Naive Bayes, linear discriminant analysis, support vector machine, K-nearest neighbors, and Decision Tree, as well as artificial neural networks, were applied to predict whether LGR was used in the sector models. The best algorithm was found to be the Decision Tree classifier, with 100% accuracy on the training set and 99% accuracy on the test set. The LGR setup (size of the LGR area and range of the transmissibility multiplier) was also predicted best by the Decision Tree classifier, with 91% accuracy on the training set and 88% accuracy on the test set. The machine learning model was validated using actual production data and the dynamic models of history-matched wells. Finally, using the machine learning predictions on wells with poor history matching results, their dynamic models were updated and significantly improved.
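Beyond the binary decision, this abstract also reports predicting the LGR setup itself (size of the LGR area and range of the transmissibility multiplier). A minimal sketch of treating those two quantities as categorical targets is shown below; the class encodings and data are placeholders, not the study's actual setup.

```python
# Minimal sketch (assumed data): predict the LGR setup as two categorical targets
# with a single Decision Tree wrapped in a multi-output classifier.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 12))               # placeholder production-derived features
size_class = rng.integers(0, 3, size=500)    # e.g. small / medium / large LGR area (placeholder)
tm_class = rng.integers(0, 3, size=500)      # e.g. low / mid / high multiplier range (placeholder)
Y = np.column_stack([size_class, tm_class])

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=1)

model = MultiOutputClassifier(DecisionTreeClassifier(random_state=1)).fit(X_train, Y_train)
per_target_accuracy = (model.predict(X_test) == Y_test).mean(axis=0)
print("test accuracy (LGR size class, multiplier-range class):", per_target_accuracy)
```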


Heart disease is a common problem that can be very severe in old age and in people who do not maintain a healthy lifestyle. Regular check-ups and diagnosis, together with decent eating habits, can prevent it to some extent. In this paper we implement a widely used and important machine learning algorithm to predict heart disease in a patient. The decision tree classifier is built on the symptoms, specifically the attributes required for prediction. Using the decision tree algorithm, we can identify the attributes that lead to the best prediction on the datasets. The decision tree algorithm solves the problem with a tree representation: each internal node of the tree represents an attribute, and each leaf node corresponds to a class label. The support vector machine algorithm classifies the datasets on the basis of a kernel and separates the data using a hyperplane. The main objective of this project is to help reduce the number of occurrences of heart disease in patients.
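A minimal sketch of the tree representation described above is given below: internal nodes test an attribute and leaves carry the class label. The attribute names and data are hypothetical placeholders, not the paper's dataset.

```python
# Minimal sketch (assumed data): fit a small decision tree and print its structure,
# showing attribute tests at internal nodes and class labels at the leaves.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
features = ["chest_pain", "resting_bp", "cholesterol", "max_heart_rate"]  # placeholder attributes
X = rng.normal(size=(300, len(features)))
y = (X[:, 0] + X[:, 2] > 0).astype(int)       # placeholder "heart disease" class label

tree = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)
print(export_text(tree, feature_names=features))
```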


2019 ◽  
Vol 19 (1) ◽  
pp. 54-61
Author(s):  
Pakarti Riswanto ◽  
RZ. Abdul Aziz ◽  
Sriyanto -

In the field of medicine, data mining plays an important and evolving role that can change how doctors, practitioners, and health researchers detect breast cancer in a patient. It has two classification applications: diagnosis, which distinguishes benign tumors from malignant cancer, and prognosis, which determines the likelihood that cancer cells will reappear in patients who have undergone surgery. Data mining aims to describe new findings in a dataset and uses statistical, mathematical, artificial intelligence, and machine learning techniques to extract and identify useful information and related knowledge from the database. Classification can be performed with several methods, including Decision Tree, K-Nearest Neighbor, Naive Bayes, ID3, CART, and Linear Discriminant Analysis, each with its own advantages and disadvantages. In this study, the authors focus on data mining classification using the Support Vector Machine and Decision Tree algorithms. The study analyzes the Breast Cancer Wisconsin (Original) data set obtained from the UCI Machine Learning Repository to classify breast cancer malignancy. The Decision Tree classifier, which handles large databases well, is used for feature selection and is combined with an SVM method suited to analyzing and diagnosing breast cancer patients because of its accurate results on the existing problem. Keywords— Data Mining, diagnosis, Decision Tree, SVM Method
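As a rough illustration of combining the two algorithms, the sketch below uses a Decision Tree's feature importances for feature selection and an SVM for the final benign/malignant classification. It uses scikit-learn's bundled Wisconsin (Diagnostic) dataset as a stand-in for the UCI Breast Cancer Wisconsin (Original) set analyzed in the paper, and the pipeline is an assumption, not the authors' exact method.

```python
# Minimal sketch (assumed pipeline): tree-based feature selection followed by an SVM.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectFromModel
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)    # stand-in for the UCI Wisconsin (Original) set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = make_pipeline(
    StandardScaler(),
    SelectFromModel(DecisionTreeClassifier(random_state=0)),  # keep features the tree finds important
    SVC(kernel="rbf"),                                        # classify benign vs. malignant
)
model.fit(X_train, y_train)
print("test accuracy:", round(model.score(X_test, y_test), 3))
```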


Electronics ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 699
Author(s):  
Yogendra Singh Solanki ◽  
Prasun Chakrabarti ◽  
Michal Jasinski ◽  
Zbigniew Leonowicz ◽  
Vadim Bolshev ◽  
...  

Nowadays, breast cancer is the most frequent cancer among women. Early detection is a critical issue that can be effectively achieved by machine learning (ML) techniques. Thus, in this article, methods to improve the accuracy of ML classification models for the prognosis of breast cancer are investigated. A wrapper-based feature selection approach along with nature-inspired algorithms such as Particle Swarm Optimization, Genetic Search, and Greedy Stepwise has been used to identify the important features. On these selected features, the popular machine learning classifiers Support Vector Machine, J48 (the C4.5 decision tree algorithm), and Multilayer Perceptron (a feed-forward ANN) were used in the system. The methodology of the proposed system is structured into five stages: (1) data pre-processing; (2) data imbalance handling; (3) feature selection; (4) machine learning classifiers; (5) classifier performance evaluation. The dataset used in this research is the Breast Cancer Wisconsin (Diagnostic) Data Set from the UCI Machine Learning Repository. This article indicates that the J48 decision tree classifier is the appropriate machine learning-based classifier for optimum breast cancer prognosis. Support Vector Machine with the Particle Swarm Optimization algorithm for feature selection achieves an accuracy of 98.24%, MCC = 0.961, Sensitivity = 99.11%, Specificity = 96.54%, and a Kappa statistic of 0.9606. It is also observed that the J48 Decision Tree classifier with the Genetic Search algorithm for feature selection achieves an accuracy of 98.83%, MCC = 0.974, Sensitivity = 98.95%, Specificity = 98.58%, and a Kappa statistic of 0.9735. Furthermore, the Multilayer Perceptron ANN classifier with the Genetic Search algorithm for feature selection achieves an accuracy of 98.59%, MCC = 0.968, Sensitivity = 98.6%, Specificity = 98.57%, and a Kappa statistic of 0.9682.
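A minimal sketch of the wrapper idea is given below, using scikit-learn's greedy SequentialFeatureSelector as a stand-in for the Particle Swarm / Genetic / Greedy Stepwise searches, wrapped around each candidate classifier; the feature counts and settings are assumptions, and scikit-learn's CART tree stands in for J48/C4.5.

```python
# Minimal sketch (assumed settings): wrapper-style feature selection around each classifier.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "SVM": SVC(),
    "Decision Tree (J48 stand-in)": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(30,), max_iter=2000, random_state=0),
}

for name, clf in candidates.items():
    pipe = make_pipeline(
        StandardScaler(),
        SequentialFeatureSelector(clf, n_features_to_select=10, cv=3),  # greedy wrapper search
        clf,
    )
    acc = cross_val_score(pipe, X, y, cv=5).mean()
    print(f"{name:30s} cross-validated accuracy = {acc:.3f}")
```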


Author(s):  
Cristián Castillo-Olea ◽  
Begonya Garcia-Zapirain Soto ◽  
Clemente Zuñiga

The article presents a study based on timeline data analysis of the level of sarcopenia in older patients in Baja California, Mexico. Information was examined at the beginning of the study (first event), three months later (second event), and six months later (third event). Sarcopenia is defined as the loss of muscle mass, quality, and strength. The study was conducted with 166 patients: 65% were women and 35% were men, and the mean age of the enrolled patients was 77.24 years. The research included 99 variables covering medical history, pharmacology, psychological tests, comorbidity (Charlson), functional capacity (Barthel and Lawton), undernourishment (the validated mini nutritional assessment (MNA) test), as well as biochemical and socio-demographic data. Our aim was to evaluate the prevalence of the level of sarcopenia in a population of chronically ill patients assessed at the Tijuana General Hospital. We used machine learning techniques to identify the determining variables and focus on the patients' evolution. The following classifiers were used: Support Vector Machines, Linear Support Vector Machines, Radial Basis Function, Gaussian process, Decision Tree, Random Forest, multilayer perceptron, AdaBoost, Gaussian Naive Bayes, and Quadratic Discriminant Analysis. In order of importance, we found that the following variables determine the level of sarcopenia: Age, Systolic arterial hypertension, mini nutritional assessment (MNA), Number of chronic diseases, and Sodium. They are therefore considered relevant in the decision-making process of choosing treatment or prevention. Analysis of the relationship between the presence of these variables and the classifiers used to measure sarcopenia revealed that the Decision Tree classifier, with the Age, Systolic arterial hypertension, MNA, Number of chronic diseases, and Sodium variables, showed a precision of 0.864, accuracy of 0.831, and an F1 score of 0.900 in the first and second events. A precision of 0.867, accuracy of 0.825, and an F1 score of 0.867 were obtained in event three with the same variables. We can therefore conclude that the Decision Tree classifier yields the best results for the assessment of the determining variables and suggests that the study population's sarcopenia did not change from moderate to severe.
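The sketch below shows, under assumed placeholder clinical data, how the reported precision, accuracy, and F1 score can be computed for a Decision Tree restricted to the five determining variables named in the abstract; the column values and labels are synthetic, not the Tijuana General Hospital data.

```python
# Minimal sketch (assumed data): Decision Tree on the five determining variables,
# evaluated with precision, accuracy, and F1 score.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, accuracy_score, f1_score

rng = np.random.default_rng(7)
cols = ["age", "systolic_hypertension", "mna_score", "n_chronic_diseases", "sodium"]
df = pd.DataFrame(rng.normal(size=(166, len(cols))), columns=cols)  # 166 patients, placeholder values
y = (df["age"] + df["mna_score"] > 0).astype(int)                   # placeholder sarcopenia-level label

X_train, X_test, y_train, y_test = train_test_split(df, y, test_size=0.3, random_state=7)
clf = DecisionTreeClassifier(random_state=7).fit(X_train, y_train)
pred = clf.predict(X_test)

print("precision:", round(precision_score(y_test, pred), 3))
print("accuracy: ", round(accuracy_score(y_test, pred), 3))
print("F1 score: ", round(f1_score(y_test, pred), 3))
```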


2021 ◽  
Vol 18 (2(Suppl.)) ◽  
pp. 0884
Author(s):  
Raja Azlina Raja Mahmood ◽  
AmirHossien Abdi ◽  
Masnida Hussin

Some of the main challenges in developing an effective network-based intrusion detection system (IDS) include analyzing large network traffic volumes and realizing the decision boundaries between normal and abnormal behaviors. Deploying feature selection together with efficient classifiers in the detection system can overcome these problems. Feature selection finds the most relevant features, thus reducing the dimensionality and complexity of analyzing the network traffic. Moreover, using the most relevant features to build the predictive model reduces the complexity of the developed model, thereby reducing the classifier model building time and consequently improving detection performance. In this study, two different sets of selected features have been adopted to train four machine learning-based classifiers. The two sets of selected features are based on the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) approaches, respectively. These evolutionary-based algorithms are known to be effective in solving optimization problems. The classifiers used in this study are Naïve Bayes, k-Nearest Neighbor, Decision Tree, and Support Vector Machine, which were trained and tested using the NSL-KDD dataset. The performance of these classifiers using the different feature sets was evaluated. The experimental results indicate that the detection accuracy improves by approximately 1.55% when the PSO-based selected features are used instead of the GA-based selected features. The Decision Tree classifier trained with the PSO-based selected features outperformed the other classifiers, with accuracy, precision, recall, and F-score of 99.38%, 99.36%, 99.32%, and 99.34%, respectively. The results show that coupling optimal features with a good classifier in a detection system can reduce the classifier model building time, reduce the computational burden of analyzing data, and consequently attain a high detection rate.
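The sketch below illustrates evaluating the four classifiers on two pre-computed feature subsets standing in for the GA- and PSO-selected sets. The NSL-KDD file path, numeric column layout, and subset indices are placeholders, not the study's actual selections.

```python
# Minimal sketch (assumed data and subsets): four classifiers evaluated on two feature subsets.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

df = pd.read_csv("nsl_kdd_numeric.csv")          # assumed: numeric features + binary "label" column
y = df["label"]
features = df.drop(columns=["label"])

feature_sets = {
    "GA-selected":  features.columns[:15],       # placeholder subset
    "PSO-selected": features.columns[5:20],      # placeholder subset
}
classifiers = {
    "Naive Bayes": GaussianNB(),
    "k-NN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
}

for set_name, cols in feature_sets.items():
    X_tr, X_te, y_tr, y_te = train_test_split(features[cols], y, test_size=0.3, random_state=0)
    for clf_name, clf in classifiers.items():
        model = make_pipeline(StandardScaler(), clf).fit(X_tr, y_tr)
        print(f"{set_name:12s} {clf_name:13s} accuracy = {model.score(X_te, y_te):.4f}")
```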


Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1055
Author(s):  
Qian Sun ◽  
William Ampomah ◽  
Junyu You ◽  
Martha Cather ◽  
Robert Balch

Machine-learning technologies have exhibited robust competence in solving many petroleum engineering problems. Their accurate predictions and fast computational speed enable large volumes of time-consuming engineering processes, such as history matching and field development optimization, to be carried out. The Southwest Regional Partnership on Carbon Sequestration (SWP) project requires rigorous history-matching and multi-objective optimization processes, which fit the strengths of machine-learning approaches. Although the machine-learning proxy models are trained and validated before being applied to practical problems, their error margin inevitably introduces uncertainties into the results. In this paper, a hybrid numerical machine-learning workflow for solving various optimization problems is presented. By coupling expert machine-learning proxies with a global optimizer, the workflow successfully solves the history-matching and CO2 water-alternating-gas (WAG) design problems with low computational overhead. The history-matching work considers the heterogeneities of multiphase relative characteristics, and the CO2-WAG injection design takes multiple techno-economic objective functions into account. This work trained an expert response surface, a support vector machine, and a multi-layer neural network as proxy models to effectively learn the high-dimensional nonlinear data structure. The proposed workflow suggests revisiting the high-fidelity numerical simulator for validation purposes. The experience gained from this work provides valuable guiding insights for similar CO2 enhanced oil recovery (EOR) projects.
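The proxy-plus-global-optimizer idea can be sketched as below: a neural-network regressor is trained on (parameter, misfit) pairs from simulator runs, and a global optimizer searches the cheap proxy for a low history-matching error before the candidate is re-run in the high-fidelity simulator. The objective, parameter bounds, and training data are placeholders, not the SWP project's actual models.

```python
# Minimal sketch (assumed data): surrogate-assisted history matching with a
# neural-network proxy and a global optimizer (differential evolution).
import numpy as np
from sklearn.neural_network import MLPRegressor
from scipy.optimize import differential_evolution

rng = np.random.default_rng(3)
params = rng.uniform(0.0, 1.0, size=(400, 5))                  # normalized model parameters (placeholder)
misfit = ((params - 0.3) ** 2).sum(axis=1) + 0.01 * rng.normal(size=400)  # placeholder simulator misfit

proxy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=3)
proxy.fit(params, misfit)                                      # cheap surrogate of the simulator response

result = differential_evolution(
    lambda x: float(proxy.predict(x.reshape(1, -1))[0]),       # proxy-predicted misfit to minimize
    bounds=[(0.0, 1.0)] * 5,
    seed=3,
)
print("proxy-optimal parameters:", np.round(result.x, 3))
# Per the workflow above, this candidate would then be validated in the high-fidelity simulator.
```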


This work derives methodologies to detect heart issues at an earlier stage and to alert the patient so that they can improve their health. To address this problem, we use machine learning techniques to predict the incidence at an earlier stage. We use parameters such as age, sex, height, weight, case history, smoking, and alcohol consumption, together with tests such as blood pressure, cholesterol, diabetes, ECG, and ECHO, for prediction. In machine learning there are many algorithms that can be used to solve this issue, including K-Nearest Neighbour, Support Vector Classifier, Decision Tree classifier, Logistic Regression, and Random Forest classifier. Using these parameters and algorithms, we predict whether or not the patient has heart disease and recommend ways for the patient to improve his/her health.
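A minimal sketch of comparing the five algorithms with cross-validation is shown below; the clinical features and labels are synthetic placeholders, not an actual patient dataset.

```python
# Minimal sketch (assumed data): cross-validated comparison of the five listed algorithms.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(11)
X = rng.normal(size=(400, 10))                 # placeholder clinical features (age, BP, cholesterol, ...)
y = (X[:, 1] + X[:, 4] > 0).astype(int)        # placeholder heart-disease label

models = {
    "K-Nearest Neighbour": KNeighborsClassifier(),
    "Support Vector Classifier": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=11),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=11),
}
for name, model in models.items():
    print(f"{name:26s} mean CV accuracy = {cross_val_score(model, X, y, cv=5).mean():.3f}")
```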

