A Comparison of Machine Learning Approaches for Prediction of Permeability using Well Log Data in the Hydrocarbon Reservoirs

Mohsen Talebkeikhah; Zahra Sadeghtabaghi; Mehdi Shabani

doi:10.28991/hef-2021-02-02-01

A Comparison of Machine Learning Approaches for Prediction of Permeability using Well Log Data in the Hydrocarbon Reservoirs

Journal of Human, Earth, and Future ◽

10.28991/hef-2021-02-02-01 ◽

2021 ◽

Vol 2 (2) ◽

pp. 82-99

Author(s):

Mohsen Talebkeikhah ◽

Zahra Sadeghtabaghi ◽

Mehdi Shabani

Keyword(s):

Neural Network ◽

Random Forest ◽

Gamma Ray ◽

Accurate Determination ◽

Coefficient Of Determination ◽

Support Vector ◽

Learning Approaches ◽

Statistical Parameters ◽

Southwest Of Iran ◽

Gamma Ray Log

Permeability is a vital parameter in reservoir engineering because it directly affects production, so determining it accurately is essential. In this paper, the permeability of two notable carbonate reservoirs (Ilam and Sarvak) in the southwest of Iran was predicted by several different methods, and the accuracy of the models was compared. For this purpose, Multi-Layer Perceptron Neural Network (MLP), Radial Basis Function Neural Network (RBF), Support Vector Regression (SVR), decision tree (DT), and random forest (RF) methods were chosen. The full set of real well-logging data was screened with random forest, and five variables were selected as the most influential: depth, computed gamma-ray log (CGR), spectral gamma-ray log (SGR), neutron porosity log (NPHI), and density log (RHOB). These were used as input data, with permeability as the output. The permeability values were derived from core analysis. Statistical parameters such as the coefficient of determination (R2), root mean square error (RMSE), and standard deviation (SD) were determined for the train, test, and total sets. Based on statistical and graphical results, the SVM and DT models perform more accurately than the others. The RMSE, SD, and R2 values are 0.38, 1.63, and 0.97 for the SVM model and 0.44, 2.89, and 0.96 for the DT model, respectively. The results of the best proposed models were then compared with the outcome of an empirical equation for permeability prediction. The comparison indicates that artificial intelligence methods estimate permeability more accurately than traditional methods such as empirical equations.
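The two most common of the statistical parameters named in this abstract, RMSE and R2, can be sketched as follows. This is a minimal illustration using their standard textbook definitions, not the authors' code; the function names and the sample values are hypothetical, and SD is omitted because the abstract does not say which definition was used.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical numbers for illustration only; these are not data from the paper.
core_perm = [1.0, 2.0, 3.0, 4.0]   # stands in for core-measured permeability
pred_perm = [1.1, 1.9, 3.2, 3.8]   # stands in for a model's predictions

print(f"RMSE = {rmse(core_perm, pred_perm):.3f}")
print(f"R2   = {r_squared(core_perm, pred_perm):.3f}")
```

A lower RMSE and an R2 closer to 1 both indicate a closer fit, which is the sense in which the abstract ranks the models.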


Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world, since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are increasingly becoming necessary for computer systems connected to the Internet. Malware exploits a system’s vulnerabilities to steal valuable information without the user’s knowledge and stealthily sends it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale in detecting obfuscated and packed malware. The cause of a problem is often best understood by studying the structural aspects of a program, such as mnemonics, instruction opcodes, and API calls. In this paper, we therefore investigate the relevance of features of unpacked malicious and benign executables, such as mnemonics, instruction opcodes, and API calls, to identify a feature that classifies the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we evaluate the performance of a collection of deep neural networks, namely a Deep Dense network, a One-Dimensional Convolutional Neural Network (1D-CNN), and a CNN-LSTM, in classifying unknown samples, and we observed promising results using APIs and system calls. Combining APIs/system calls with static features gave a marginal performance improvement over models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that achieved F1-scores of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.
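For readers unfamiliar with the F1-score reported here, a minimal sketch of its standard definition (the harmonic mean of precision and recall) follows. The function name and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        # No true positives means precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)   # share of flagged samples that are truly malicious
    recall = tp / (tp + fn)      # share of malicious samples that were flagged
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
print(f"F1 = {f1_score(tp=90, fp=10, fn=10):.3f}")
```

Because F1 ignores true negatives, it is less flattering than plain accuracy when benign samples heavily outnumber malicious ones.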


Analysis of the Nosema Cells Identification for Microscopic Images

Sensors ◽

10.3390/s21093068 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3068

Author(s):

Soumaya Dghim ◽

Carlos M. Travieso-González ◽

Radim Burget

Keyword(s):

Neural Network ◽

Machine Learning ◽

Image Processing ◽

Deep Learning ◽

The Other ◽

Support Vector ◽

Learning Approaches ◽

Microscopic Images ◽

Trained Neural Network ◽

Nosema Disease

The use of image processing tools, machine learning, and deep learning approaches has become very useful and robust in recent years. This paper addresses the detection of Nosema disease, which is considered to be one of the most economically significant diseases today. This work presents a solution for recognizing and identifying Nosema cells among the other objects in a microscopic image. Two main strategies are examined. The first strategy uses image processing tools to extract the most valuable information and features from the dataset of microscopic images. Machine learning methods, such as an artificial neural network (ANN) and a support vector machine (SVM), are then applied to detect and classify the Nosema disease cells. The second strategy explores deep learning and transfer learning. Several approaches were examined, including a convolutional neural network (CNN) classifier and several transfer learning models (AlexNet, VGG-16 and VGG-19), which were fine-tuned and applied to the object sub-images in order to distinguish the Nosema images from the other object images. The best accuracy, 96.25%, was reached by the pre-trained VGG-16 neural network.


Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods

Animals ◽

10.3390/ani10050771 ◽

2020 ◽

Vol 10 (5) ◽

pp. 771

Author(s):

Toshiya Arakawa

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Markov Models ◽

Tracking System ◽

Video Tracking ◽

Training Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Mammalian behavior is typically monitored by observation. However, direct observation requires a substantial amount of effort and time if the number of mammals to be observed is large or if the observation is conducted for a prolonged period. In this study, machine learning methods, namely hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks, were applied to estimate whether a goat is in estrus based on the goat’s behavior, and the adequacy of each method was verified. Tracking data for the goats were obtained using a video tracking system and used to estimate whether goats in “estrus” or “non-estrus” were in either of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of random forest appears to be the highest. However, its PC for goats other than those whose data were used for training is relatively low, which suggests that random forest tends to overfit the training data. Apart from random forest, the PC of HMMs and SVMs is high. Considering the calculation time and the HMM’s advantage as a time-series model, the HMM is the better method. The PC of the neural network is low overall; however, if more goat data were acquired, the neural network could become an adequate method for estimation.


Random Forest Regressor-Based Approach for Detecting Fault Location and Duration in Power Systems

Sensors ◽

10.3390/s22020458 ◽

2022 ◽

Vol 22 (2) ◽

pp. 458

Author(s):

Zakaria El Mrabet ◽

Niroop Sugunaraj ◽

Prakash Ranganathan ◽

Shrirang Abhyankar

Keyword(s):

Neural Network ◽

Random Forest ◽

Power Systems ◽

Real Time ◽

Fault Location ◽

State Of The Art ◽

Economic Consequences ◽

Support Vector ◽

Detection Accuracy ◽

Data Driven Approach

Power system failures or outages due to short-circuits or “faults” can result in long service interruptions leading to significant socio-economic consequences. It is critical for electrical utilities to quickly ascertain fault characteristics, including location, type, and duration, to reduce the service time of an outage. Existing fault detection mechanisms (relays and digital fault recorders) are slow to communicate the fault characteristics upstream to the substations and control centers for action to be taken quickly. Fortunately, due to the availability of high-resolution phasor measurement units (PMUs), more event-driven solutions can be captured in real time. In this paper, we propose a data-driven approach for determining fault characteristics using samples of fault trajectories. A random forest regressor (RFR)-based model is used to detect fault location and duration simultaneously in real time. This model is based on combining multiple uncorrelated trees with state-of-the-art boosting and aggregating techniques in order to obtain robust generalizations and greater accuracy without overfitting or underfitting. Four cases were studied to evaluate the performance of RFR: (1) detecting fault location, (2) predicting fault duration, (3) handling missing data, and (4) identifying fault location and length in a real-time streaming environment. A comparative analysis was conducted between the RFR algorithm and state-of-the-art models, including deep neural network, Hoeffding tree, neural network, support vector machine, decision tree, naive Bayes, and K-nearest neighbors. Experiments revealed that RFR consistently outperformed the other models in detection accuracy, prediction error, and processing time.


Comparison of support vector machine, random forest and neural network classifiers for tree species classification on airborne hyperspectral APEX images

European Journal of Remote Sensing ◽

10.1080/22797254.2017.1299557 ◽

2017 ◽

Vol 50 (1) ◽

pp. 144-154 ◽

Cited By ~ 95

Author(s):

Edwin Raczko ◽

Bogdan Zagajewski

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Random Forest ◽

Tree Species ◽

Support Vector ◽

Species Classification ◽

Tree Species Classification ◽

Neural Network Classifiers


A Machine Learning View on Momentum and Reversal Trading

Algorithms ◽

10.3390/a11110170 ◽

2018 ◽

Vol 11 (11) ◽

pp. 170 ◽

Cited By ~ 2

Author(s):

Zhixi Li ◽

Vincent Tam

Keyword(s):

Neural Network ◽

Machine Learning ◽

Stock Market ◽

Short Term Memory ◽

Predictive Ability ◽

Trading Strategies ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Learning Techniques

Momentum and reversal effects are important phenomena in stock markets. In academia, relevant studies have been conducted for years. Researchers have attempted to analyze these phenomena using statistical methods and to give some plausible explanations. However, those explanations are sometimes unconvincing. Furthermore, it is very difficult to transfer the findings of these studies to real-world investment trading strategies due to the lack of predictive ability. This paper represents the first attempt to adopt machine learning techniques for investigating the momentum and reversal effects occurring in any stock market. In the study, various machine learning techniques, including the Decision Tree (DT), Support Vector Machine (SVM), Multilayer Perceptron Neural Network (MLP), and Long Short-Term Memory Neural Network (LSTM) were explored and compared carefully. Several models built on these machine learning approaches were used to predict the momentum or reversal effect on the stock market of mainland China, thus allowing investors to build corresponding trading strategies. The experimental results demonstrated that these machine learning approaches, especially the SVM, are beneficial for capturing the relevant momentum and reversal effects, and possibly building profitable trading strategies. Moreover, we propose the corresponding trading strategies in terms of market states to acquire the best investment returns.


Brain Signal Classification Based on Deep CNN

International Journal of Security and Privacy in Pervasive Computing ◽

10.4018/ijsppc.2020040102 ◽

2020 ◽

Vol 12 (2) ◽

pp. 17-29

Author(s):

Terry Gao ◽

Grace Ying Wang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Mental Status ◽

Machine Learning Techniques ◽

Support Vector ◽

Imaging Features ◽

Learning Approaches ◽

Computationally Efficient ◽

Set Up ◽

Brain Data

It is essential to increase the accuracy and robustness of classification of brain data, including EEG, in order to facilitate direct communication between the human brain and computerized devices. Different machine learning approaches, such as support vector machine (SVM), neural network, and linear discriminant analysis (LDA), have been applied to set up an automatic subjective-classifier, and the findings on their capacities in this regard have been inconclusive. The present study developed an effective classifier for human mental status using deep learning in a convolutional neural network. In contrast to most previous studies, which commonly used the EEG waveform or numeric values of brain signals for classification, the authors utilised imaging features generated from EEG data in the alpha frequency band. The new model proposed in this study provides a simple and computationally efficient approach to distinguishing mental status during resting. With training, this model could classify new 2D EEG images with above 90% accuracy, while traditional machine learning techniques failed to achieve this accuracy.


Support Vector Machine and Neural Network based Model for Monthly Stream Flow Forecasting

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.35.23089 ◽

2018 ◽

Vol 7 (4.35) ◽

pp. 683

Author(s):

Nuratiah Zaini ◽

Marlinda Abdul Malek ◽

Marina Yusoff ◽

Siti Fatimah Che Osmi ◽

Nurul Hani Mardi ◽

...

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Hybrid Model ◽

Peninsular Malaysia ◽

Coefficient Of Determination ◽

Support Vector ◽

Design Development ◽

Streamflow Forecasting ◽

Monthly Streamflow ◽

Bpnn Model

Accurate streamflow forecasting is needed in water resources planning and management, flood prevention and design development. In this study, the accuracy of two hybrid models, support vector machine - particle swarm optimization (SVM-PSO) and bat algorithm - backpropagation neural network (BA-BPNN), for monthly streamflow forecasting at the Kuantan River in Peninsular Malaysia is investigated and compared with regular SVM and BPNN models. The heuristic optimization methods PSO and BA are introduced to find the optimum SVM and BPNN parameters. The inputs to the forecasting models are antecedent streamflow, historical rainfall and the meteorological parameters evaporation, temperature, relative humidity and mean wind speed. Two performance measures, root mean square error (RMSE) and coefficient of determination (R2), were employed to evaluate the developed forecasting models. In the testing phase, the hybrid SVM-PSO yields an RMSE of 24.8267 m³/s and an R2 of 0.9651, while the general SVM model yields an RMSE of 27.5086 m³/s and an R2 of 0.9305. The hybrid BA-BPNN produces an RMSE of 17.7579 m³/s and an R2 of 0.7740, while the BPNN model produces an RMSE of 28.1396 m³/s and an R2 of 0.5015. The results therefore indicate that the hybrid models, SVM-PSO and BA-BPNN, perform better in streamflow forecasting than the general SVM and BPNN, respectively.


Evaluation of Machine Learning Approaches to Predict Soil Organic Matter and pH Using vis-NIR Spectra

Sensors ◽

10.3390/s19020263 ◽

2019 ◽

Vol 19 (2) ◽

pp. 263 ◽

Cited By ~ 16

Author(s):

Meihua Yang ◽

Dongyun Xu ◽

Songchao Chen ◽

Hongyi Li ◽

Zhou Shi

Keyword(s):

Machine Learning ◽

Organic Matter ◽

Soil Organic Matter ◽

Least Squares ◽

Paddy Soil ◽

Prediction Accuracy ◽

Accurate Determination ◽

Support Vector ◽

Learning Approaches ◽

Lower Yangtze

Soil organic matter (SOM) and pH are essential soil fertility indicators of paddy soil in the middle-lower Yangtze Plain. Rapid, non-destructive and accurate determination of SOM and pH is vital to preventing soil degradation caused by inappropriate land management practices. Visible-near infrared (vis-NIR) spectroscopy with multivariate calibration can be used to effectively estimate soil properties. In this study, 523 soil samples were collected from paddy fields in the Yangtze Plain, China. Four machine learning approaches, namely partial least squares regression (PLSR), least squares-support vector machines (LS-SVM), extreme learning machines (ELM) and the Cubist regression model (Cubist), were used to compare the prediction accuracy based on vis-NIR full bands and bands reduced using the genetic algorithm (GA). The coefficient of determination (R2), root mean square error (RMSE), and ratio of performance to inter-quartile distance (RPIQ) were used to assess the prediction accuracy. The ELM with GA-reduced bands was the best model for SOM (R2 = 0.81, RMSE = 5.17, RPIQ = 2.87) and pH (R2 = 0.76, RMSE = 0.43, RPIQ = 2.15). The performance of the LS-SVM for pH prediction did not differ significantly between the model with GA (R2 = 0.75, RMSE = 0.44, RPIQ = 2.08) and without GA (R2 = 0.74, RMSE = 0.45, RPIQ = 2.07). Although only a slight improvement was observed when ELM was used to predict SOM and pH with reduced bands (SOM: R2 = 0.81, RMSE = 5.17, RPIQ = 2.87; pH: R2 = 0.76, RMSE = 0.43, RPIQ = 2.15) compared with full bands (SOM: R2 = 0.81, RMSE = 5.18, RPIQ = 2.83; pH: R2 = 0.76, RMSE = 0.45, RPIQ = 2.07), the number of wavelengths was greatly reduced (SOM: 201 to 44; pH: 201 to 32). Thus, the ELM coupled with GA-reduced bands is recommended for predicting the properties of paddy soil (SOM and pH) in the middle-lower Yangtze Plain.
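RPIQ is less widely known than R2 and RMSE, so a minimal sketch may help. It assumes the usual definition, the interquartile range of the observed values divided by the RMSE of prediction. The function name and sample values are hypothetical, and the quartile convention (Python's default "exclusive" method) is an assumption that may differ from the one the authors used.

```python
import math
import statistics


def rpiq(y_true, y_pred):
    """Ratio of performance to inter-quartile distance: (Q3 - Q1) / RMSE."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(y_true, n=4)
    n = len(y_true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return (q3 - q1) / rmse


# Hypothetical values for illustration only; these are not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
predicted = [v + 0.5 for v in observed]  # constant error of 0.5, so RMSE = 0.5

print(f"RPIQ = {rpiq(observed, predicted):.2f}")
```

A higher RPIQ means the prediction error is small relative to the spread of the observed values, which is why the abstract reports it alongside RMSE.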


Comparison of random forest, support vector machine and back propagation neural network for electronic tongue data classification: Application to the recognition of orange beverage and Chinese vinegar

Sensors and Actuators B Chemical ◽

10.1016/j.snb.2012.11.071 ◽

2013 ◽

Vol 177 ◽

pp. 970-980 ◽

Cited By ~ 151

Author(s):

Miao Liu ◽

Mingjun Wang ◽

Jun Wang ◽

Duo Li

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Random Forest ◽

Back Propagation ◽

Electronic Tongue ◽

Data Classification ◽

Back Propagation Neural Network ◽

Support Vector ◽

Chinese Vinegar
