Machine learning approach to improve vapor recovery: Prediction and frequency converter with a new vapor recovery system

In practice, volatile organic compound (VOC) pollution can occur during refueling because of the properties of gasoline, namely its low viscosity and high saturated vapor pressure. A new gasoline vapor recovery system involving frequency conversion technology and machine learning is developed to cope with this problem. In the proposed system, the pumping capacity of the vacuum pump is first evaluated, and tests show an almost linear relationship between suction volume and frequency. Then, a Multi-Layer Perceptron (MLP) neural network and support vector regression (SVR) are employed to predict the gas-liquid ratio, and numerical examples are presented to demonstrate the high prediction accuracy of both the MLP and SVR, with the MLP neural network showing better generalization ability. Finally, compared with two gasoline vapor recovery systems based on the 1:1 fixed control model and the PID control model, respectively, the new gasoline vapor recovery system significantly improves gasoline vapor recovery efficiency.

Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are increasingly becoming necessary for computer systems connected to the Internet. This software exploits the system’s vulnerabilities to steal valuable information without the user’s knowledge and stealthily sends it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale in detecting obfuscated and packed malware. The cause of a problem is often best understood by studying the structural aspects of a program, such as mnemonics, instruction opcodes, and API calls. In this paper, we therefore investigate the relevance of the features of unpacked malicious and benign executables, such as mnemonics, instruction opcodes, and API calls, to identify features that classify the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we evaluate the performance of a collection of deep neural networks, namely a Deep Dense network, a One-Dimensional Convolutional Neural Network (1D-CNN), and a CNN-LSTM, in classifying unknown samples, and we observed promising results using APIs and system calls. On combining APIs/system calls with static features, a marginal performance improvement was attained compared with models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that resulted in an F1-score of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.

Analysis of the Nosema Cells Identification for Microscopic Images

Sensors ◽

10.3390/s21093068 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3068

Author(s):

Soumaya Dghim ◽

Carlos M. Travieso-González ◽

Radim Burget

Keyword(s):

Neural Network ◽

Machine Learning ◽

Image Processing ◽

Deep Learning ◽

The Other ◽

Support Vector ◽

Learning Approaches ◽

Microscopic Images ◽

Trained Neural Network ◽

Nosema Disease

The use of image processing tools, machine learning, and deep learning approaches has become very useful and robust in recent years. This paper introduces the detection of the Nosema disease, which is considered to be one of the most economically significant diseases today. This work presents a solution for recognizing and identifying Nosema cells among the other objects present in the microscopic image. Two main strategies are examined. The first strategy uses image processing tools to extract the most valuable information and features from the dataset of microscopic images. Then, machine learning methods, such as an artificial neural network (ANN) and a support vector machine (SVM), are applied for detecting and classifying the Nosema disease cells. The second strategy explores deep learning and transfer learning. Several approaches were examined, including a convolutional neural network (CNN) classifier and several transfer learning models (AlexNet, VGG-16 and VGG-19), which were fine-tuned and applied to the object sub-images in order to distinguish the Nosema images from the other object images. The best accuracy, 96.25%, was reached by the VGG-16 pre-trained neural network.

Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods

Animals ◽

10.3390/ani10050771 ◽

2020 ◽

Vol 10 (5) ◽

pp. 771

Author(s):

Toshiya Arakawa

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Markov Models ◽

Tracking System ◽

Video Tracking ◽

Training Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Mammalian behavior is typically monitored by observation. However, direct observation requires a substantial amount of effort and time if the number of mammals to be observed is large or if the observation is conducted for a prolonged period. In this study, machine learning methods such as hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks were applied to detect and estimate whether a goat is in estrus based on the goat’s behavior, and the adequacy of each method was verified. Tracking data for the goats were obtained using a video tracking system and used to estimate whether the goats, which were in “estrus” or “non-estrus”, were in either of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of the random forest appears to be the highest. However, its PC for goats other than those whose data were used in the training data sets is relatively low, which suggests that the random forest tends to overfit the training data. Apart from the random forest, the PC of HMMs and SVMs is high. Considering the calculation time and the HMM’s advantage as a time-series model, the HMM is the better method. The PC of the neural network is low overall; however, if data from more goats were acquired, the neural network could become an adequate method for estimation.

Landslide susceptibility mapping based on convolutional neural network and conventional machine learning methods

10.21203/rs.3.rs-190195/v1 ◽

2021 ◽

Author(s):

Rui Liu ◽

Xin Yang ◽

Chong Xu ◽

Luyao Li ◽

Xiangqiang Zeng

Keyword(s):

Neural Network ◽

Machine Learning ◽

Convolutional Neural Network ◽

Landslide Susceptibility ◽

Susceptibility Mapping ◽

Landslide Susceptibility Mapping ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods ◽

Conventional Machine

Landslide susceptibility mapping (LSM) is a useful tool to estimate the probability of landslide occurrence, providing a scientific basis for natural hazards prevention, land use planning, and economic development in landslide-prone areas. To date, a large number of machine learning methods have been applied to LSM, and recently the Convolutional Neural Network (CNN) has been gradually adopted to enhance the prediction accuracy of LSM. The objective of this study is to introduce a CNN-based model in LSM and systematically compare its overall performance with the conventional machine learning models of random forest, logistic regression, and support vector machine. Herein, we selected the Jiuzhaigou region in Sichuan Province, China as the study area. A total of 710 landslides and 12 predisposing factors were stacked to form spatial datasets for LSM. ROC analysis and several statistical metrics, such as accuracy, root mean square error (RMSE), Kappa coefficient, sensitivity, and specificity, were used to evaluate the performance of the models on the training and validation datasets. Finally, the trained models were applied and the landslide susceptibility zones were mapped. Results suggest that both the CNN and the conventional machine learning models have satisfactory performance (AUC: 85.72% to 90.17%). The CNN-based model exhibits excellent goodness-of-fit and prediction capability; it not only achieves the highest performance (AUC: 90.17%) but also significantly reduces the salt-and-pepper effect, which indicates its great potential for application to LSM.
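
The AUC values quoted in this abstract can be read as the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `roc_auc` and the labels and scores are hypothetical illustration values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for landslide, 0 for non-landslide (assumed encoding).
    scores: model susceptibility scores; higher means more susceptible.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for six map cells (not data from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; for large maps a sort-based rank computation would be the usual choice.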

Improving Machine Learning Identification of Unsafe Driver Behavior by Means of Sensor Fusion

Applied Sciences ◽

10.3390/app10186417 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6417 ◽

Cited By ~ 1

Author(s):

Emanuele Lattanzi ◽

Giacomo Castellucci ◽

Valerio Freschi

Keyword(s):

Neural Network ◽

Machine Learning ◽

Driver Behavior ◽

Ground Truth ◽

Support Vector ◽

Svm Classifier ◽

Learning Technology ◽

Average Accuracy ◽

Unsafe Behaviors ◽

Vehicle Sensors

Most road accidents occur due to human fatigue, inattention, or drowsiness. Recently, machine learning technology has been successfully applied to identifying driving styles and recognizing unsafe behaviors starting from in-vehicle sensor signals such as vehicle and engine speed, throttle position, and engine load. In this work, we investigated the fusion of different external sensors, such as a gyroscope and a magnetometer, with in-vehicle sensors, to improve machine learning identification of unsafe driver behavior. Starting from those signals, we computed a set of features capable of accurately describing the behavior of the driver. A support vector machine and an artificial neural network were then trained and tested using several features calculated over more than 200 km of travel. The ground truth used to evaluate classification performance was obtained by means of an objective methodology based on the relationship between speed and the lateral and longitudinal acceleration of the vehicle. The classification results showed an average accuracy of about 88% using the SVM classifier and of about 90% using the neural network, demonstrating the potential of the proposed methodology to identify unsafe driver behaviors.

HARTH: A Human Activity Recognition Dataset for Machine Learning

Sensors ◽

10.3390/s21237853 ◽

2021 ◽

Vol 21 (23) ◽

pp. 7853

Author(s):

Aleksej Logacjov ◽

Kerstin Bach ◽

Atle Kongsvold ◽

Hilde Bremseth Bårdstu ◽

Paul Jarle Mork

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Convolutional Neural Network ◽

Activity Recognition ◽

Human Activity ◽

Short Term Memory ◽

Human Activity Recognition ◽

Support Vector ◽

Free Living

Existing accelerometer-based human activity recognition (HAR) benchmark datasets that were recorded during free living suffer from non-fixed sensor placement, the usage of only one sensor, and unreliable annotations. We make two contributions in this work. First, we present the publicly available Human Activity Recognition Trondheim dataset (HARTH). Twenty-two participants were recorded for 90 to 120 min during their regular working hours using two three-axial accelerometers, attached to the thigh and lower back, and a chest-mounted camera. Experts annotated the data independently using the camera’s video signal and achieved high inter-rater agreement (Fleiss’ Kappa =0.96). They labeled twelve activities. The second contribution of this paper is the training of seven different baseline machine learning models for HAR on our dataset. We used a support vector machine, k-nearest neighbor, random forest, extreme gradient boost, convolutional neural network, bidirectional long short-term memory, and convolutional neural network with multi-resolution blocks. The support vector machine achieved the best results with an F1-score of 0.81 (standard deviation: ±0.18), recall of 0.85±0.13, and precision of 0.79±0.22 in a leave-one-subject-out cross-validation. Our highly professional recordings and annotations provide a promising benchmark dataset for researchers to develop innovative machine learning approaches for precise HAR in free living.
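
In a leave-one-subject-out cross-validation, each fold holds out every sample from one participant and trains on the samples of all others, so that scores reflect performance on unseen people. The sketch below only illustrates how such folds can be built; the function name `leave_one_subject_out` and the subject identifiers are hypothetical and not taken from the HARTH code base.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Training set: every sample that does not belong to the held-out subject.
        train_idx = sorted(
            i for subject, idxs in by_subject.items() if subject != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical subject label for each recorded window (not HARTH data).
subject_ids = ["S01", "S01", "S02", "S03", "S03", "S03"]
folds = list(leave_one_subject_out(subject_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Per-fold metrics computed this way can then be averaged across subjects, which is one way a mean and standard deviation like those reported above could arise.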

Deep Learning Assisted Neonatal Cry Classification via Support Vector Machine Models

Frontiers in Public Health ◽

10.3389/fpubh.2021.670352 ◽

2021 ◽

Vol 9 ◽

Author(s):

Ashwini K ◽

P. M. Durai Raj Vincent ◽

Kathiravan Srinivasan ◽

Chuan-Yu Chang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Feature Extraction ◽

Deep Learning ◽

Convolutional Neural Network ◽

Support Vector ◽

Svm Classifier ◽

Infant Cry ◽

Learning Techniques

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. For audio signals, preprocessing, feature extraction, and feature selection require expert attention and considerable effort. Deep learning techniques extract and select the most important features automatically, but they require an enormous amount of data for effective classification. This work discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the neonatal cries. This work thus combines the advantages of machine learning and deep learning techniques to obtain good results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction combined with an SVM classifier provides promising results. In a comparison of the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy among the kernel-based infant cry classification systems, at 88.89%.
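
The three SVM kernels compared in this abstract differ only in how they measure similarity between two feature vectors. The sketch below writes out their standard textbook forms; the parameter names `gamma`, `degree`, and `coef0` and their default values are assumptions chosen for illustration, not settings reported by the authors.

```python
import math


def linear_kernel(x, y):
    """k(x, y) = <x, y>"""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """k(x, y) = (gamma * <x, y> + coef0) ** degree"""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """k(x, y) = exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors (stand-ins for CNN-derived features).
x = [1.0, 2.0]
y = [3.0, 4.0]
k_lin = linear_kernel(x, y)
k_poly = polynomial_kernel(x, y, degree=2)
k_rbf = rbf_kernel(x, y, gamma=0.5)
print(k_lin, k_poly, k_rbf)
```

The RBF kernel depends only on the distance between the two vectors, so it equals 1 for identical inputs and decays toward 0 as they move apart, while the linear and polynomial kernels grow with the inner product.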

Predicting Rainfall-Induced Soil Erosion Based on a Hybridization of Adaptive Differential Evolution and Support Vector Machine Classification

Mathematical Problems in Engineering ◽

10.1155/2021/6647829 ◽

2021 ◽

Vol 2021 ◽

pp. 1-20

Author(s):

Tuan Vu Dinh ◽

Hieu Nguyen ◽

Xuan-Linh Tran ◽

Nhat-Duc Hoang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Soil Erosion ◽

Differential Evolution ◽

Influencing Factors ◽

Support Vector ◽

Adaptive Differential Evolution ◽

Artificial Neural

Soil erosion induced by rainfall is a critical problem in many regions of the world, particularly in tropical areas where the annual rainfall often exceeds 2000 mm. Predicting soil erosion is a challenging task, subject to variation in soil characteristics, slope, vegetation cover, land management, and weather conditions. Conventional models based on the mechanism of soil erosion processes generally provide good results but are time-consuming due to calibration and validation. The goal of this study is to develop a machine learning model based on the support vector machine (SVM) for soil erosion prediction. The SVM serves as the main prediction machinery, establishing a nonlinear function that maps the considered influencing factors to accurate predictions. In addition, in order to improve the accuracy of the model, the history-based adaptive differential evolution with linear population size reduction and population-wide inertia term (L-SHADE-PWI) is employed to find an optimal set of parameters for the SVM. Thus, the proposed method, named L-SHADE-PWI-SVM, is an integration of machine learning and metaheuristic optimization. For the purpose of training and testing the method, a dataset consisting of 236 samples of soil erosion in Northwest Vietnam is collected with 10 influencing factors. The training set includes 90% of the original dataset; the rest of the dataset is reserved for assessing the generalization capability of the model. The experimental results indicate that the newly developed L-SHADE-PWI-SVM method is a competitive soil erosion predictor with superior performance statistics. Most importantly, L-SHADE-PWI-SVM can achieve a high classification accuracy rate of 92%, which is much better than that of the backpropagation artificial neural network (87%) and the radial basis function artificial neural network (78%).

CAN A MACHINE LEARNING ALGORITHM IDENTIFY SARS-COV-2 VARIANTS BASED ON CONVENTIONAL rRT-PCR? PROOF OF CONCEPT

10.1101/2021.11.12.21266286 ◽

2021 ◽

Author(s):

Jorge Cabrera Alvargonzalez ◽

Ana Larranaga Janeiro ◽

Sonia Perez ◽

Javier Martinez Torres ◽

Lucia Martinez Lamas ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Supervised Classification ◽

Learning Algorithm ◽

Support Vector ◽

Classification Algorithms ◽

Machine Learning Algorithm ◽

Proof Of Concept ◽

The Past ◽

Number Of Cycles

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been and remains one of the major challenges humanity has faced thus far. Over the past few months, large amounts of information have been collected that are only now beginning to be assimilated. In the present work, the existence of residual information in the massive numbers of rRT-PCRs that tested positive, out of the almost half a million tests that were performed during the pandemic, is investigated. This residual information is believed to be highly related to a pattern in the number of cycles that are necessary to detect positive samples as such. Thus, a database of more than 20,000 positive samples was collected, and two supervised classification algorithms (a support vector machine and a neural network) were trained to temporally locate each sample based solely on the number of cycles determined in the rRT-PCR of each individual. Finally, the classification results show that the appearance of each wave coincides with the surge of each of the variants present in the region of Galicia (Spain) during the SARS-CoV-2 pandemic, and that the waves are clearly identified by the classification algorithm.

Estimating Forest Carbon Fluxes Using Machine Learning Techniques Based on Eddy Covariance Measurements

Sustainability ◽

10.3390/su10010203 ◽

2018 ◽

Vol 10 (1) ◽

pp. 203 ◽

Cited By ~ 8

Author(s):

Xianming Dou ◽

Yongguo Yang ◽

Jinhui Luo

Keyword(s):

Neural Network ◽

Machine Learning ◽

Eddy Covariance ◽

Carbon Flux ◽

Carbon Fluxes ◽

Forest Carbon ◽

Machine Learning Techniques ◽

Support Vector ◽

Inference System ◽

Learning Techniques

Approximating the complex nonlinear relationships that dominate the exchange of carbon dioxide fluxes between the biosphere and atmosphere is fundamentally important for addressing the issue of climate change. The progress of machine learning techniques has offered a number of useful tools for the scientific community aiming to gain new insights into the temporal and spatial variation of different carbon fluxes in terrestrial ecosystems. In this study, adaptive neuro-fuzzy inference system (ANFIS) and generalized regression neural network (GRNN) models were developed to predict the daily carbon fluxes in three boreal forest ecosystems based on eddy covariance (EC) measurements. Moreover, a comparison was made between the modeled values derived from these models and those of traditional artificial neural network (ANN) and support vector machine (SVM) models. These models were also compared with multiple linear regression (MLR). Several statistical indicators, including coefficient of determination (R2), Nash-Sutcliffe efficiency (NSE), bias error (Bias) and root mean square error (RMSE) were utilized to evaluate the performance of the applied models. The results showed that the developed machine learning models were able to account for most of the variance in the carbon fluxes at both daily and hourly time scales in the three stands, and they consistently and substantially outperformed the MLR model for both daily and hourly carbon flux estimates. It was demonstrated that the ANFIS and ANN models provided similar estimates in the testing period with an approximate value of R2 = 0.93, NSE = 0.91, Bias = 0.11 g C m−2 day−1 and RMSE = 1.04 g C m−2 day−1 for daily gross primary productivity, 0.94, 0.82, 0.24 g C m−2 day−1 and 0.72 g C m−2 day−1 for daily ecosystem respiration, and 0.79, 0.75, 0.14 g C m−2 day−1 and 0.89 g C m−2 day−1 for daily net ecosystem exchange, and slightly outperformed the GRNN and SVM models. In practical terms, however, the newly developed models (ANFIS and GRNN) are more robust and flexible, and have fewer parameters needed for selection and optimization in comparison with traditional ANN and SVM models. Consequently, they can be used as valuable tools to estimate forest carbon fluxes and fill the missing carbon flux data during the long-term EC measurements.
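
The four indicators named in this abstract can be computed from paired observed and modeled flux values. The sketch below uses common textbook definitions and is not the authors' code: it assumes R2 is the squared Pearson correlation (which would explain why it can differ from NSE), and that Bias is the mean of modeled minus observed values. The sample numbers are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def rmse(observed, modeled):
    """Root mean square error."""
    return math.sqrt(mean([(m - o) ** 2 for o, m in zip(observed, modeled)]))


def bias(observed, modeled):
    """Mean error, assuming the convention modeled minus observed."""
    return mean([m - o for o, m in zip(observed, modeled)])


def nse(observed, modeled):
    """Nash-Sutcliffe efficiency: 1 - SSE / total variance of the observations."""
    obs_mean = mean(observed)
    sse = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst


def r_squared(observed, modeled):
    """Squared Pearson correlation (assumed definition of R2)."""
    obs_mean, mod_mean = mean(observed), mean(modeled)
    cov = sum((o - obs_mean) * (m - mod_mean) for o, m in zip(observed, modeled))
    var_o = sum((o - obs_mean) ** 2 for o in observed)
    var_m = sum((m - mod_mean) ** 2 for m in modeled)
    return cov ** 2 / (var_o * var_m)


# Hypothetical daily fluxes in g C m-2 day-1 (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
modeled = [1.5, 2.5, 3.5, 4.5]
print(r_squared(observed, modeled), nse(observed, modeled),
      bias(observed, modeled), rmse(observed, modeled))
```

In this toy case the modeled values are perfectly correlated with the observations but shifted by a constant, so R2 stays at 1 while NSE drops to 0.8: NSE penalizes systematic offsets that a correlation-based R2 ignores.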
