Successful Case Study of Machine Learning Application to Streamline and Improve History Matching Process for Complex Gas-Condensate Reservoirs in Hai Thach Field, Offshore Vietnam

2021
Author(s):
Son Hoang
Tung Tran
Tan Nguyen
Tu Truong
Duy Pham
...

Abstract This paper reports a successful case study of applying machine learning to improve the history matching process, making it easier, less time-consuming, and more accurate, by determining whether Local Grid Refinement (LGR) with a transmissibility multiplier is needed to history match gas-condensate wells producing from geologically complex reservoirs, and by determining the required LGR setup for those producers. History matching Hai Thach gas-condensate production wells is extremely challenging due to the combined effect of condensate banking, a sub-seismic fault network, complex reservoir distribution and connectivity, uncertain HIIP, and a lack of PVT data for most reservoirs. In fact, for some wells, many trial simulation runs were conducted before it became clear that LGR with a transmissibility multiplier was required to obtain a good history match. To minimize this time-consuming trial-and-error process, machine learning was applied in this study to analyze production data using synthetic samples generated by a very large number of compositional sector models, so that the need for LGR could be identified before the history matching process begins. Furthermore, the machine learning application could also determine the required LGR setup. The method helped provide better models in a much shorter time and greatly improved the efficiency and reliability of the dynamic modeling process. More than 500 synthetic samples were generated using compositional sector models and divided into separate training and test sets. Multiple classification algorithms, such as logistic regression, Gaussian Naive Bayes, Bernoulli Naive Bayes, multinomial Naive Bayes, linear discriminant analysis, support vector machine, K-nearest neighbors, and Decision Tree, as well as artificial neural networks, were applied to predict whether LGR was used in the sector models.
The best algorithm was found to be the Decision Tree classifier, with 100% accuracy on the training set and 99% accuracy on the test set. The LGR setup (size of the LGR area and range of the transmissibility multiplier) was also predicted best by the Decision Tree classifier, with 91% accuracy on the training set and 88% accuracy on the test set. The machine learning model was validated using actual production data and the dynamic models of history-matched wells. Finally, the machine learning predictions were used to update the dynamic models of wells with poor history matching results, and those models were significantly improved.
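The classifier-selection step described above can be sketched with scikit-learn. The features and labels below are generic stand-ins generated with `make_classification`, since the paper's compositional sector-model samples are not public; only the workflow (500+ synthetic samples, train/test split, Decision Tree accuracy on both sets) mirrors the abstract.

```python
# Sketch of the Decision Tree classification step on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# 500+ synthetic samples; binary label stands in for "LGR needed" vs "no LGR"
X, y = make_classification(n_samples=520, n_features=10, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# An unpruned decision tree typically fits the training set perfectly,
# matching the 100% training accuracy reported in the abstract.
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = accuracy_score(y_train, clf.predict(X_train))
test_acc = accuracy_score(y_test, clf.predict(X_test))
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The gap between training and test accuracy is the quantity the authors track when comparing the nine algorithms.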

2021
Author(s):
Son K. Hoang
Tung V. Tran
Tan N. Nguyen
Tu A. Truong
Duy H. Pham
...

Abstract This study aims to apply machine learning (ML) to make the history matching (HM) process easier, faster, more accurate, and more reliable by determining whether Local Grid Refinement (LGR) with a transmissibility multiplier is needed to history match gas-condensate wells producing from geologically complex reservoirs, and by determining how the LGR should be set up to successfully history match those production wells. The main challenges for HM of gas-condensate production from Hai Thach wells are the large effect of condensate banking (condensate blockage), flow baffling by the sub-seismic fault network, complex reservoir distribution and connectivity, highly uncertain HIIP, and a lack of PVT information for most reservoirs. In this study, ML was applied to analyze production data using synthetic samples generated by a very large number of compositional sector models, so that the need for LGR could be identified before the HM process and the required LGR setup could also be determined. The proposed method helped provide better models in a much shorter time and improved the efficiency and reliability of the dynamic modeling process. More than 500 synthetic samples were generated using compositional sector models and divided into training and test sets. Supervised classification algorithms including logistic regression; Gaussian, Bernoulli, and multinomial Naïve Bayes; linear discriminant analysis; support vector machine; K-nearest neighbors; and Decision Tree, as well as artificial neural networks (ANN), were applied to the data sets to determine the need for LGR in HM. The best algorithm was found to be the Decision Tree classifier, with 100% and 99% accuracy on the training and test sets, respectively. The size of the LGR area could also be determined reasonably well, at 89% and 87% accuracy on the training and test sets, respectively, and the range of the transmissibility multiplier at 97% and 91% accuracy, respectively.
Moreover, the ML model was validated using actual production and HM data. A new method of applying ML in the dynamic modeling and HM of challenging gas-condensate wells in geologically complex reservoirs has been successfully applied to the high-pressure, high-temperature Hai Thach field offshore Vietnam. The proposed method helped avoid many trial-and-error simulation runs and provided better and more reliable dynamic models.


2019
Vol 8 (4)
pp. 1477-1483

With fast-moving technological advancement, internet usage has increased rapidly in all fields. Money transactions for applications such as online shopping, banking, bill settlement, online ticket booking for travel and hotels, fee payment to educational organizations, payment for hospital treatment, and supermarket payments all rely on online credit card transactions. This opens the door to fraudulent use of other people's accounts and transactions, resulting in loss of service and profit to the institution. With this background, this paper focuses on predicting fraudulent credit card transactions. The Credit Card Transaction dataset from the Kaggle machine learning repository is used for the prediction analysis. The analysis of fraudulent credit card transactions is carried out in four steps. First, the relationships between the variables of the dataset are identified and represented graphically. Second, the feature importance of the dataset is identified using Random Forest, AdaBoost, Logistic Regression, Decision Tree, Extra Trees, Gradient Boosting, and Naive Bayes classifiers. Third, the extracted feature importance of the credit card transaction dataset is fitted to the Random Forest, AdaBoost, Logistic Regression, Decision Tree, Extra Trees, Gradient Boosting, and Naive Bayes classifiers. Fourth, performance analysis is done using metrics such as accuracy, F-score, AUC score, precision, and recall. The implementation is done in Python in the Anaconda Spyder Integrated Development Environment. Experimental results show that the Decision Tree classifier achieved the most effective prediction, with a precision of 1.0, recall of 1.0, F-score of 1.0, AUC score of 89.09, and accuracy of 99.92%.
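The feature-importance and metric-evaluation steps described above can be sketched as follows. The synthetic, class-imbalanced dataset is a stand-in for the Kaggle data (which is not reproduced here), and Random Forest stands in for the full set of classifiers the paper compares.

```python
# Sketch: rank features by importance, then evaluate on a held-out split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score

# ~5% positive class mimics the rarity of fraudulent transactions
X, y = make_classification(n_samples=2000, n_features=8, weights=[0.95],
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

rf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X_tr, y_tr)
# feature_importances_ sums to 1.0; sort descending to rank features
ranked = sorted(enumerate(rf.feature_importances_), key=lambda t: -t[1])
pred = rf.predict(X_te)
print("top features:", [i for i, _ in ranked[:3]])
print(f"precision={precision_score(y_te, pred):.2f} "
      f"recall={recall_score(y_te, pred):.2f} f1={f1_score(y_te, pred):.2f}")
```

On heavily imbalanced data like fraud detection, precision, recall, and F-score are more informative than raw accuracy, which is why the paper reports all of them.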


Author(s):  
Angela More

Abstract: Data analytics plays a vital role in diagnosis and treatment in the health care sector. To enable practitioner decision-making, huge volumes of data must be processed with machine learning techniques to produce tools for prediction and classification. Breast cancer accounts for about 1 million reported cases per year. We have proposed a prediction model specifically designed for breast cancer, using the machine learning algorithms Decision Tree, Naïve Bayes, SVM, and K-Nearest Neighbours. The model predicts the type of tumour, which can be benign (non-cancerous) or malignant (cancerous). The model uses supervised learning, a machine learning concept in which we provide dependent and independent columns to the machine, and applies a classification technique to predict the tumour type. Keywords: Cancer, Machine learning, Prediction, Data Visualization, SVM, Naïve Bayes, Classification.
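The benign/malignant classification task described above can be sketched using the Wisconsin breast cancer dataset bundled with scikit-learn, here standing in for the authors' data, with Gaussian Naïve Bayes standing in for the four algorithms they compare.

```python
# Sketch: supervised benign-vs-malignant tumour classification.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Independent columns (tumour measurements) and dependent column (label):
# y = 0 means malignant, y = 1 means benign in this dataset.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=6)

model = GaussianNB().fit(X_tr, y_tr)
print(f"test accuracy: {model.score(X_te, y_te):.2f}")
```

Swapping `GaussianNB` for `DecisionTreeClassifier`, `SVC`, or `KNeighborsClassifier` reproduces the kind of algorithm comparison the abstract describes.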


Several different classification methods can be used to classify data into specified groups or classes. This paper presents a comparison of performance between three classifiers, Naïve Bayes, Decision Tree, and Neural Network, on the Bank Marketing dataset. The study focuses on which classifier performs better, based on several performance measures, across two different datasets. The results show that the Neural Network classifier did not perform as well as the Naïve Bayes and Decision Tree classifiers. Based on the results, the larger dataset contained more information, allowing more accurate prediction and a clearer assessment of each classifier's performance.


This research work addresses diabetes prediction analysis. The prediction analysis technique has three steps: dataset input, feature extraction, and classification. In the previous system, Support Vector Machine and naïve Bayes were applied for diabetes prediction. In this research work, a voting-based method is applied instead: an ensemble approach in which three classifiers, Support Vector Machine, naïve Bayes, and a decision tree classifier, vote on the prediction. The existing and proposed methods are implemented in Python, and results are compared in terms of accuracy, precision, recall, and execution time. The analysis shows that the voting-based method gives higher performance than the individual classifiers.
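The voting ensemble described above can be sketched with scikit-learn's `VotingClassifier`. The synthetic 8-feature dataset is a stand-in for the diabetes data, which is not specified in the abstract.

```python
# Sketch: hard-voting ensemble of SVM, naive Bayes, and a decision tree.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=8, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

# "hard" voting: each classifier casts one vote; the majority label wins
vote = VotingClassifier(
    estimators=[("svm", SVC()), ("nb", GaussianNB()),
                ("dt", DecisionTreeClassifier(random_state=2))],
    voting="hard")
vote.fit(X_tr, y_tr)
print(f"voting accuracy: {vote.score(X_te, y_te):.2f}")
```

Majority voting can outperform each member when the members make different kinds of errors, which is the rationale the abstract relies on.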


2021
pp. 1-11
Author(s):
Jesús Miguel García-Gorrostieta
Aurelio López-López
Samuel González-López
Adrián Pastor López-Monroy

Writing an academic thesis is a complex task that requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this is a skill that takes time to master. In this paper, we present an exploration of lexical features used to model automatic detection of argumentative paragraphs using machine learning techniques. We present a novel proposal, which combines the information in the complete paragraph with the detection of argumentative segments to achieve improved results in detecting argumentative paragraphs. We propose two approaches: a more descriptive one, which uses a decision tree classifier with indicators and lexical features; and a more efficient one, which uses an SVM classifier with lexical features and a Document Occurrence Representation (DOR). Both approaches consider the detection of argumentative segments to ensure that a paragraph detected as argumentative indeed contains segments with argumentation. We achieved encouraging results with both approaches.
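The SVM-over-lexical-features approach can be sketched as a standard text-classification pipeline. The toy paragraphs and labels below are invented for illustration; the paper's thesis corpus and its DOR representation are not reproduced (a plain bag-of-words count vector stands in for it).

```python
# Sketch: linear SVM over lexical (bag-of-words) features of paragraphs.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

paragraphs = [
    "We argue that the proposed method improves accuracy because of X.",
    "Therefore, the evidence supports the claim that Y holds.",
    "The system consists of three modules.",
    "Data were collected from 2015 to 2018.",
]
labels = [1, 1, 0, 0]  # 1 = argumentative, 0 = non-argumentative

# Lexical cues such as "argue", "therefore", "claim" carry most of the signal
X = CountVectorizer().fit_transform(paragraphs)
clf = LinearSVC().fit(X, labels)
print(clf.predict(X))
```

In the paper's setup, a paragraph-level prediction is additionally cross-checked against segment-level detections before the paragraph is accepted as argumentative.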


2019
Vol 9 (11)
pp. 2375
Author(s):
Riaz Ullah Khan
Xiaosong Zhang
Rajesh Kumar
Abubakar Sharif
Noorbakhsh Amiri Golilarz
...

In recent years, botnets have been among the most common threats to network security, since they exploit multiple types of malicious code such as worms, Trojans, and rootkits. Botnets have been used to carry phishing links, perform attacks, and provide malicious services on the internet. Peer-to-peer (P2P) botnets are harder to identify than Internet Relay Chat (IRC), Hypertext Transfer Protocol (HTTP), and other types of botnets because P2P traffic exhibits features of both centralized and distributed communication. To resolve the issues of P2P botnet identification, we propose an effective multi-layer traffic classification method that applies machine learning classifiers to features of network traffic. Our work presents a framework based on decision trees which effectively detects P2P botnets. A decision tree algorithm is applied for feature selection, extracting the most relevant features and ignoring the irrelevant ones. At the first layer, we filter non-P2P packets to reduce the amount of network traffic, using well-known ports, Domain Name System (DNS) queries, and flow counting. The second layer further characterizes the captured network traffic as non-P2P or P2P. At the third layer of our model, we remove features that only marginally affect the classification. At the final layer, we detect P2P botnets using a Decision Tree classifier on the extracted network communication features. Our experimental evaluations show the significance of the proposed method for P2P botnet detection, demonstrating an average accuracy of 98.7%.
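The decision-tree-based feature-selection layer can be sketched with scikit-learn's `SelectFromModel`: a tree ranks the traffic features, and features below the mean importance are dropped before the final classifier. The 20 synthetic features are stand-ins for the authors' network-communication features, which are not listed in the abstract.

```python
# Sketch: use a decision tree's feature importances to prune weak features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.tree import DecisionTreeClassifier

# 20 candidate traffic features, only a few actually informative
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=3)

# Default threshold keeps features with importance above the mean
selector = SelectFromModel(DecisionTreeClassifier(random_state=3)).fit(X, y)
X_reduced = selector.transform(X)
print(f"kept {X_reduced.shape[1]} of {X.shape[1]} features")

# Final-layer classifier is trained only on the surviving features
clf = DecisionTreeClassifier(random_state=3).fit(X_reduced, y)
```

Pruning marginal features, as the third layer does, shrinks the model and reduces the per-packet cost of the final detection layer.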


Diabetes is one of the most common diseases affecting humans today. Predictions for this disease are proposed through machine learning techniques, which identify its risk factors so that its progression can be prevented. Early prediction of such a disease can help control it and save lives. For early prediction, we collected a dataset of 200 diabetic patients with 8 attributes. A patient's blood sugar level is assessed from features such as the glucose content in the body and the patient's age. The main machine learning algorithms used are Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbours (KNN), and Decision Tree (DT). In the existing system, the accuracy levels are 66% for Naïve Bayes and 70-71% for the Decision Tree, which is not a proper accuracy range. With XGBoost classifiers, however, accuracy improves to 74% relative to Naïve Bayes and to 89-90% relative to the Decision Tree. The proposed system therefore reports accuracy in a proper range and is the one mostly used. A dataset of 729 patients is stored in MongoDB; of these, 129 patient records are used for prediction and the remainder for training.
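The comparison described above can be sketched as follows. A synthetic 8-feature, 729-record dataset stands in for the patient data (which is not public here), and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost to keep the sketch dependency-free; the 600/129 train/prediction split follows the text.

```python
# Sketch: baseline naive Bayes vs. a gradient-boosted model on 729 records.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=729, n_features=8, random_state=4)
# 129 records held out for prediction, the rest used for training
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=129,
                                          random_state=4)

nb_acc = GaussianNB().fit(X_tr, y_tr).score(X_te, y_te)
gb_acc = GradientBoostingClassifier(random_state=4).fit(X_tr, y_tr) \
                                                   .score(X_te, y_te)
print(f"naive Bayes: {nb_acc:.2f}, gradient boosting: {gb_acc:.2f}")
```

Boosted tree ensembles usually beat single trees and naive Bayes on tabular clinical data, which is the pattern the abstract reports.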


Author(s):  
P. Chandra Sandeep

CharityML is a fictional non-profit organization created for the sole purpose of this project. Many non-profit organizations survive on the donations they receive, and they need to be very selective about whom to approach for donations. In this project, we used several supervised algorithms to accurately model individuals' income using data collected from the 1994 U.S. Census. We then select the best algorithm from the preliminary results and optimize it for better prediction. The goal of this implementation is to construct a model that accurately predicts whether an individual makes more than $50,000. This kind of task can help a non-profit organization, which survives on donations, better understand how large a grant to request, or whether it should reach out at all. While it can be difficult to determine a person's general income bracket from known sources, we can infer this value from other publicly available features. The dataset for this project originates from the UCI Machine Learning Repository. It was donated by Ron Kohavi and Barry Becker after being published in the article "Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid". The data examined here include a few modifications to the raw dataset, such as removing the 'fnlwgt' attribute and records with missing or ill-formatted fields.
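The income-classification task can be sketched as below. The three numeric features and the label rule are invented stand-ins (the real project uses the UCI census features such as age, education level, and hours per week), and a shallow decision tree stands in for the tuned final model.

```python
# Sketch: predict whether income exceeds $50,000 from census-like features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(400, 3))                   # stand-in numeric features
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)   # 1 = income > $50,000

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=5)
# Shallow tree: limiting depth is one simple form of the optimization step
model = DecisionTreeClassifier(max_depth=4, random_state=5).fit(X_tr, y_tr)
print(f"held-out accuracy: {model.score(X_te, y_te):.2f}")
```

In the actual project, several such models are compared on a held-out split before one is selected and tuned.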

