Sequential Prediction of Drilling Fluid Loss Using Support Vector Machine and Decision Tree Methods

2021 ◽  
Author(s):  
Oluwatosin John Rotimi ◽  
David Nnaemeka Ukwu ◽  
Wang Zhenli ◽  
Yao Liang ◽  
Anthony A. Ameloko ◽  
...  

Abstract Machine learning methods have been applied to predict depths of fluid loss in hydrocarbon exploration. During drilling, lost circulation is the undesired loss of all or part of the drilling mud or fluid into the surrounding formations, caused by hydrostatic pressure high enough to fracture the formation or widen existing fractures encountered during drilling. In this study, we applied Python implementations of Support Vector Machine (SVM) and Decision Tree (DT) methods to categorical data obtained from drilling operations in a producing field to predict lost circulation occurrence. The models leveraged the capability of both SVM and DT to perform binary classification by adopting a flow-out percentage of less than 70 percent as the marker of lost circulation. That is, < 70% is represented as Loss and > 70% as No Loss. The prediction models were applied to 10 input variables preprocessed with principal component analysis (PCA) to reduce dimensionality and focus on essential variables. Preprocessing improved the SVM model's results but did not affect the DT models. Overall, the DT models accurately predicted fluid loss zones and can be scaled up to field operations with the option of continuously sampled variables.
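The labelling rule described in the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names and the sample readings are hypothetical, and the handling of a reading of exactly 70% is an assumption, since the abstract only specifies the < 70% and > 70% cases.

```python
from typing import List

LOSS_THRESHOLD_PCT = 70.0  # flow-out threshold stated in the abstract


def label_flow_out(flow_out_pct: float, threshold: float = LOSS_THRESHOLD_PCT) -> str:
    """Return "Loss" when flow-out is below the threshold, else "No Loss".

    Assumption: a reading of exactly 70% is labelled "No Loss"; the
    abstract does not say how the boundary value is treated.
    """
    return "Loss" if flow_out_pct < threshold else "No Loss"


def label_series(flow_out_readings: List[float]) -> List[str]:
    """Label a sequence of flow-out readings (for example, one per depth)."""
    return [label_flow_out(value) for value in flow_out_readings]


# Hypothetical flow-out percentages, not data from the study.
readings = [95.0, 82.5, 69.9, 40.0, 70.0]
labels = label_series(readings)
print(labels)
```

Labels produced this way would serve as the binary target for a classifier such as SVM or DT.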

2020 ◽  
Vol 6 ◽  
pp. e275
Author(s):  
Binti Solihah ◽  
Azhari Azhari ◽  
Aina Musdholifah

Background A conformational B-cell epitope is one of the main components of vaccine design. It contains separate segments in its sequence, which are spatially close in the antigen chain. The availability of Ag-Ab complex data on the Protein Data Bank allows for the development of predictive methods. Several epitope prediction models have also been developed, including learning-based methods. However, the performance of these models is still not optimal. The main problem in learning-based prediction models is class imbalance. Methods This study proposes CluSMOTE, which is a combination of a cluster-based undersampling method and the Synthetic Minority Oversampling Technique. The approach is used to generate additional sample data to ensure that the dataset of the conformational epitope is balanced. The Hierarchical DBSCAN algorithm is performed to identify the clusters in the majority class. Some randomly selected data are taken from each cluster, considering the oversampling degree, and combined with the minority class data. The balanced data are used as the training dataset to develop a conformational epitope prediction model. Furthermore, two binary classification methods, Support Vector Machine and Decision Tree, are separately used to develop the prediction model and to evaluate the performance of CluSMOTE in predicting conformational B-cell epitopes. The experiment is focused on determining the best parameters for optimal CluSMOTE. Two independent datasets are used to compare the proposed prediction model with state-of-the-art methods. The first and the second datasets represent the general protein and the glycoprotein antigens, respectively. Result The experimental results show that the CluSMOTE Decision Tree outperformed the Support Vector Machine in terms of AUC and Gmean as performance measurements. The mean AUC values of the CluSMOTE Decision Tree on the Kringelum and the SEPPA 3 test sets are 0.83 and 0.766, respectively. This shows that the CluSMOTE Decision Tree is better than other methods on the general protein antigen, though comparable with SEPPA 3 on the glycoprotein antigen.
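Two ingredients named in this abstract, SMOTE-style oversampling and the G-mean metric, can be illustrated with a minimal Python sketch. It shows only the generic SMOTE interpolation step and the usual definition of G-mean as the geometric mean of sensitivity and specificity. It is not the authors' CluSMOTE, and it omits the cluster-based undersampling with Hierarchical DBSCAN. The function names, the neighbour count `k`, and the sample points are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def smote_samples(minority: List[Point], n_new: int, k: int = 3, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (generic SMOTE step).
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority point, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical minority-class points in a 2-D feature space.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_samples(minority_points, n_new=5)
print(new_points)
print(g_mean(tp=8, fn=2, tn=9, fp=1))
```

Because G-mean multiplies the two class-wise rates, a classifier that ignores the minority class scores zero, which is why it is a common choice for imbalanced data.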


2019 ◽  
Vol 15 (2) ◽  
pp. 275-280
Author(s):  
Agus Setiyono ◽  
Hilman F Pardede

It is now common for a cellphone to receive spam messages. The large number of received messages makes it difficult for a human to classify them as Spam or not Spam. One way to overcome this problem is to use Data Mining for automatic classification. In this paper, we investigate three data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine is the best of the three evaluated algorithms. Support Vector Machine achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
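As a rough illustration of one of the evaluated techniques, the sketch below implements a small Multinomial Naïve Bayes text classifier with Laplace (add-one) smoothing in plain Python. It is not the paper's implementation: the whitespace tokenisation, the function names, and the four training messages are hypothetical choices made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Model = Tuple[Counter, Dict[str, Counter], Set[str]]


def train_nb(messages: List[Tuple[str, str]]) -> Model:
    """Count class frequencies and per-class word frequencies."""
    class_counts: Counter = Counter()
    word_counts: Dict[str, Counter] = {}
    vocab: Set[str] = set()
    for text, label in messages:
        words = text.lower().split()  # assumption: simple whitespace tokens
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict_nb(model: Model, text: str) -> str:
    """Return the class with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_messages = sum(class_counts.values())
    best_label, best_score = "", -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_messages)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words unseen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, not the dataset used in the paper.
training = [
    ("win a free prize now", "spam"),
    ("free cash claim now", "spam"),
    ("are we meeting for lunch", "ham"),
    ("see you at the meeting", "ham"),
]
model = train_nb(training)
print(predict_nb(model, "free prize"))
print(predict_nb(model, "meeting at lunch"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied together.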

