Improving Machine Learning Identification of Unsafe Driver Behavior by Means of Sensor Fusion

Most road accidents occur due to human fatigue, inattention, or drowsiness. Recently, machine learning technology has been successfully applied to identifying driving styles and recognizing unsafe behaviors starting from in-vehicle sensor signals such as vehicle and engine speed, throttle position, and engine load. In this work, we investigated the fusion of different external sensors, such as a gyroscope and a magnetometer, with in-vehicle sensors, to improve machine learning identification of unsafe driver behavior. Starting from those signals, we computed a set of features capable of accurately describing the behavior of the driver. A support vector machine (SVM) and an artificial neural network were then trained and tested using several features calculated over more than 200 km of travel. The ground truth used to evaluate classification performance was obtained by means of an objective methodology based on the relationship between speed and the lateral and longitudinal acceleration of the vehicle. The classification results showed an average accuracy of about 88% using the SVM classifier and of about 90% using the neural network, demonstrating the potential of the proposed methodology to identify unsafe driver behaviors.

Deep Learning Assisted Neonatal Cry Classification via Support Vector Machine Models

Frontiers in Public Health ◽

10.3389/fpubh.2021.670352 ◽

2021 ◽

Vol 9 ◽

Author(s):

Ashwini K ◽

P. M. Durai Raj Vincent ◽

Kathiravan Srinivasan ◽

Chuan-Yu Chang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Feature Extraction ◽

Deep Learning ◽

Convolutional Neural Network ◽

Support Vector ◽

Svm Classifier ◽

Infant Cry ◽

Learning Techniques

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. For audio signals, preprocessing, feature extraction, and feature selection traditionally need expert attention and much effort. Deep learning techniques automatically extract and select the most important features, but they require an enormous amount of data for effective classification. This work discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction combined with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy for the kernel-based infant cry classification system, 88.89%.
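
The spectrogram step mentioned in the abstract can be illustrated with a minimal sketch of a short-time Fourier transform magnitude. This is not the authors' pipeline: the function name `stft_magnitude`, the frame length, the hop size, the Hann window, and the toy sine signal are all assumptions chosen for illustration, and a plain DFT is used in place of an FFT to keep the example dependency-free.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=8, hop=4):
    """Return one magnitude spectrum (bins 0..frame_len//2) per frame."""
    # Periodic Hann window, applied to reduce leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical toy input: a pure tone that falls exactly on bin 2.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = stft_magnitude(tone)
```

Each inner list is one column of the spectrogram; stacking the columns gives the time-frequency image that a CNN could take as input.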

Classification of Crop Residue Cover in High-Resolution RGB Images Using Machine Learning

Journal of the ASABE ◽

10.13031/ja.14572 ◽

2022 ◽

Vol 65 (1) ◽

pp. 75-86

Author(s):

Parth C. Upadhyay ◽

John A. Lory ◽

Guilherme N. DeSouza ◽

Timotius A. P. Lagaunne ◽

Christine M. Spinka

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Feature Selection Method ◽

Texture Features ◽

Ground Truth ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Aerial Vehicle ◽

Rgb Images

Highlights:

A machine learning framework estimated residue cover in RGB images taken at three resolutions from 88 locations.

The best results primarily used texture features, the RFE-SVM feature selection method, and the SVM classifier.

Accounting for shadows and plants plus modifying and optimizing the texture features may improve performance.

An automated system developed using machine learning is a viable strategy to estimate residue cover from RGB images obtained with handheld or UAV platforms.

Abstract. Maintaining plant residue on the soil surface contributes to sustainable cultivation of arable land. Applying machine learning methods to RGB images of residue could overcome the subjectivity of manual methods. The objectives of this study were to use supervised machine learning while identifying the best feature selection method, the best classifier, and the most effective image feature types for classifying residue levels in RGB imagery. Imagery was collected from 88 locations in 40 row-crop fields in five Missouri counties between early May and late June in 2018 and 2019 using a tripod-mounted camera (0.014 cm pixel⁻¹ ground sampling distance, GSD) and an unmanned aerial vehicle (UAV, 0.05 and 0.14 GSD). At each field location, 50 contiguous 0.3 × 0.2 m region of interest (ROI) images were extracted from the imagery, resulting in a dataset of 4,400 ROI images at each GSD. Residue percentages for ground truth were estimated using a bullseye grid method (n = 100 points) based on the 0.014 GSD images. Representative color, texture, and shape features were extracted and evaluated using four feature selection methods and two classifiers. Recursive feature elimination using support vector machine (RFE-SVM) was the best feature selection method, and the SVM classifier performed best for classifying the amount of residue as a three-class problem. The best features for this application were associated with texture, with local binary pattern (LBP) features being the most prevalent for all three GSDs. Shape features were irrelevant. The three residue classes were correctly identified with 88%, 84%, and 81% 10-fold cross-validation scores for the 2018 training data and 81%, 69%, and 65% accuracy for the 2019 testing data in decreasing resolution order. Converting image-wise data (0.014 GSD) to location residue estimates using a Bayesian model showed good agreement with the location-based ground truth (r² = 0.90). This initial assessment documents the use of RGB images to match other methods of estimating residue, with potential to replace or be used as a quality control for line-transect assessments.

Keywords: Feature selection, Soil erosion, Support vector machine, Texture features, Unmanned aerial vehicle.
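
Since local binary pattern (LBP) features were the most prevalent texture features, a minimal sketch of the basic 8-neighbour LBP operator may help. It assumes a grayscale image stored as a list of lists and the common "neighbour >= centre" thresholding rule; the function names and the toy patch are hypothetical, and the study may have used a different LBP variant (for example, rotation-invariant or multi-radius).

```python
def lbp_codes(image):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D list."""
    # Neighbour offsets, clockwise from top-left; bit i is set when
    # neighbour i is greater than or equal to the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(image):
    """256-bin histogram of LBP codes, usable as a texture feature vector."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    return hist


# Hypothetical 3x3 patch: bright top row, darker elsewhere.
patch = [
    [5, 5, 5],
    [1, 3, 1],
    [1, 1, 1],
]
```

For this patch only the three top neighbours are at least as bright as the centre, so the single interior pixel gets code 7 (bits 0, 1, and 2 set).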

Machine Learning for Formation Tightness Prediction and Mobility Prediction

10.2118/206208-ms ◽

2021 ◽

Author(s):

Zhaoya Fan ◽

Jichao Chen ◽

Tao Zhang ◽

Ning Shi ◽

Wei Zhang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Linear Regression ◽

Success Rate ◽

Support Vector ◽

Svm Classifier ◽

Mobility Prediction ◽

Data Set ◽

Pressure Buildup ◽

Pressure Test

Abstract. From the perspective of the wireline formation test (WFT), formation tightness reflects the "speed" of pressure buildup while the pressure test is being conducted. We usually define a pressure test point that has a very low pressure-buildup speed as a tight point. The mobility derived from this kind of pressure point is usually less than 0.01 md/cP; otherwise, the pressure points are defined as valid points with valid formation pressure and mobility. Formation tightness reflects the formation permeability information and can be an indicator of the difficulty of the WFT pumping and sampling operation. Mobility, as compared to permeability, reflects the dynamic supply capacity of the formation. A rapid and good mobility prediction based on petrophysical logging can not only directly provide valid formation productivity but can also be used to evaluate the feasibility of the WFT and to support optimization work in advance. Compared to a time-consuming and costly drillstem test (DST) operation, the WFT is the most efficient and cost-saving method to confirm hydrocarbon presence. However, the success rate of WFT sampling operations in the deep Kuqa formation is less than 50% overall, mostly due to the formation tightness exceeding the capability of the tools. Therefore, a rapid mobility evaluation is necessary for WFT feasibility analysis. As companion work to a previous WFT optimization study (SPE-195932-MS), we further study and discuss machine learning for mobility prediction. In the previous study, we formed a mobility prediction workflow by doing a statistical analysis of more than 1000 pressure test points with several statistical methods, such as univariate linear regression (ULR), multivariate linear regression (MLR), neural network regression analysis (NNA), and decision tree classification analysis (DTA). In this paper, the methods and principles of machine learning are expounded. A series of machine learning methods were tested. The algorithms appropriate for this specific data set were selected: DTA, discriminant analysis (DA), logistic regression, support vector machine (SVM), and K-nearest neighbor (KNN) for formation tightness prediction; and linear regression, DTA, SVM, Gaussian process regression (GPR), random tree, and neural network analysis for mobility prediction. Contrastive analysis reveals that the SVM classifier gives the best result among the tested methods for formation tightness probability prediction. Based on R squared and RMSE analysis, linear regression, GPR, and NNA delivered relatively good results compared with other mobility prediction methods. An optimized data processing workflow was proposed, and it delivered a better result than the workflow proposed in SPE-195932-MS under the same training and testing dataset condition. The comparison between measured mobility and predicted mobility reveals that, in most situations, the predicted and measured mobility matched very well. WFTs were conducted in newly drilled wells. The sampling success rate reached 100% in these wells by optimizing the WFT tool string and sampling station selection in advance, and NPT was significantly reduced.
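
The R squared and RMSE criteria used to compare mobility predictors can be written out as a short sketch. The function names and sample values are hypothetical. The 0.01 md/cP cutoff is taken from the abstract's description of tight points, which says mobility is "usually" below that value, so treating it as a hard threshold is a simplifying assumption.

```python
import math

# Cutoff described in the abstract for tight points, in md/cP.
# Using it as a strict threshold is an assumption of this sketch.
TIGHT_MOBILITY_MD_PER_CP = 0.01


def is_tight(mobility):
    """Label a pressure test point as tight when its mobility is below the cutoff."""
    return mobility < TIGHT_MOBILITY_MD_PER_CP


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

A perfect predictor gives an RMSE of 0 and an R squared of 1, while a predictor that always returns the mean of the measured values gives an R squared of 0.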

A Machine Learning Approach to Determine Oyster Vessel Behavior

Machine Learning and Knowledge Extraction ◽

10.3390/make1010004 ◽

2018 ◽

Vol 1 (1) ◽

pp. 64-74 ◽

Cited By ~ 2

Author(s):

Devin Joseph Frey ◽

Avdesh Mishra ◽

Md Tamjidul Hoque ◽

Mahdi Abdelguerfi ◽

Thomas Soniat

Keyword(s):

Machine Learning ◽

Satellite Communication ◽

Ground Truth ◽

Optimization Techniques ◽

Support Vector ◽

Svm Classifier ◽

Test Accuracy ◽

Trajectory Data ◽

Multi Class Classification ◽

Rule Based Classifier

In this work, we address a multi-class classification task of determining oyster vessel behavior by classifying it into four classes: fishing, traveling, poling (exploring), and docked (anchored). The main purpose of this work is to automate the determination of oyster vessel behavior using machine learning and to explore different techniques to improve prediction accuracy. To employ machine learning, two important descriptors, speed and net speed, are calculated from the trajectory data recorded by a satellite communication system (Vessel Management System, VMS) attached to the vessels fishing on the public oyster grounds of Louisiana. We constructed a support vector machine (SVM) based method which employs a Radial Basis Function (RBF) kernel to accurately predict the behavior of oyster vessels. Several validation and parameter optimization techniques were used to improve the accuracy of the SVM classifier. A total of 93% of the trajectory data from a July 2013 to August 2014 dataset, consisting of 612,700 samples for which the ground truth can be obtained using a rule-based classifier, is used for validation and independent testing of our method. The results show that the proposed SVM based method is able to correctly classify 99.99% of the 612,700 samples using 10-fold cross validation. Furthermore, we achieved a precision of 1.00, recall of 1.00, F1-score of 1.00, and a test accuracy of 99.99%, while performing an independent test using a subset of 93% of the dataset, which consists of 31,418 points.
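
The precision, recall, and F1-score reported for the four behavior classes follow the standard per-class definitions, sketched below. The function name and the six toy labels are hypothetical and are not drawn from the VMS dataset.

```python
def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} from paired true/predicted labels."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against empty denominators for classes never predicted or seen.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


LABELS = ["fishing", "traveling", "poling", "docked"]
# Hypothetical toy labels: one fishing sample is mistaken for traveling.
truth = ["fishing", "fishing", "traveling", "poling", "docked", "docked"]
guess = ["fishing", "traveling", "traveling", "poling", "docked", "docked"]
scores = per_class_scores(truth, guess, LABELS)
```

In this toy case the single confusion lowers recall for fishing to 0.5 and precision for traveling to 0.5, while poling and docked keep perfect scores.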

Comparison of Machine Learning Algorithms in Assessing a Retail Store Location (Perbandingan Algoritma Machine Learning dalam Menilai Sebuah Lokasi Toko Ritel)

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v7i1.3182 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Kristiawan Kristiawan ◽

Andreas Widjaja

Keyword(s):

Neural Network ◽

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Pearson Correlation ◽

Recursive Feature Elimination ◽

Support Vector ◽

Learning Technology ◽

K Nearest Neighbor ◽

Store Location

Abstract: The application of machine learning technology in various industrial fields is currently developing rapidly, including in the retail industry. This study aims to find the most accurate algorithmic model so that it can be used to help retailers choose a store location more precisely. Several methods, such as Pearson Correlation, Chi-Square Features, Recursive Feature Elimination, and Tree-based selection, are used to select features (predictive variables). These features are then used to train and build models using six different classification algorithms, namely Logistic Regression, K Nearest Neighbor (KNN), Decision Tree, Random Forest, Support Vector Machine (SVM), and Neural Network, to classify whether a location is recommended or not as a new store location. Keywords: Application of Machine Learning, Pearson Correlation, Random Forest, Neural Network, Logistic Regression.
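
Of the feature selection methods listed, Pearson correlation is the simplest to illustrate: each candidate feature is scored by the absolute value of its correlation with the target, and the features are ranked by that score. The sketch below assumes a binary recommended/not-recommended target; the feature names and values are invented for illustration and do not come from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort feature names by |correlation with target|, strongest first."""
    return sorted(features,
                  key=lambda name: abs(pearson(features[name], target)),
                  reverse=True)


# Hypothetical store-location features and labels (1 = recommended).
features = {
    "foot_traffic": [10, 20, 30, 40, 50],
    "rent": [5, 3, 4, 1, 2],
    "constant": [1, 1, 1, 1, 1],
}
recommended = [0, 0, 1, 1, 1]
ranking = rank_features(features, recommended)
```

A selection step would then keep only the top-ranked features before training the classifiers.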

A machine learning approach for single cell interphase cell cycle staging

Scientific Reports ◽

10.1038/s41598-021-98489-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Hemaxi Narotamo ◽

Maria Sofia Fernandes ◽

Ana Margarida Moreira ◽

Soraia Melo ◽

Raquel Seruca ◽

...

Keyword(s):

Machine Learning ◽

Cell Cycle ◽

Single Cell ◽

Cell Function ◽

Ground Truth ◽

Supervised Machine Learning ◽

Support Vector ◽

Svm Classifier ◽

Interphase Cell

Abstract. The cell nucleus is a tightly regulated organelle and its architectural structure is dynamically orchestrated to maintain normal cell function. Indeed, fluctuations in nuclear size and shape are known to occur during the cell cycle and alterations in nuclear morphology are also hallmarks of many diseases including cancer. Regrettably, automated reliable tools for cell cycle staging at single cell level using in situ images are still limited. It is therefore urgent to establish accurate strategies combining bioimaging with high-content image analysis for a bona fide classification. In this study we developed a supervised machine learning method for interphase cell cycle staging of individual adherent cells using in situ fluorescence images of nuclei stained with DAPI. A Support Vector Machine (SVM) classifier operated over normalized nuclear features using more than 3500 DAPI stained nuclei. Molecular ground truth labels were obtained by automatic image processing using fluorescent ubiquitination-based cell cycle indicator (Fucci) technology. An average F1-Score of 87.7% was achieved with this framework. Furthermore, the method was validated on distinct cell types reaching recall values higher than 89%. Our method is a robust approach to identify cells in G1 or S/G2 at the individual level, with implications in research and clinical applications.

Improving Mispronunciation Detection of Arabic Words for Non-Native Learners Using Deep Convolutional Neural Network Features

Electronics ◽

10.3390/electronics9060963 ◽

2020 ◽

Vol 9 (6) ◽

pp. 963 ◽

Cited By ~ 2

Author(s):

Shamila Akhtar ◽

Fawad Hussain ◽

Fawad Riasat Raja ◽

Muhammad Ehatisham-ul-haq ◽

Naveed Khan Baloch ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Language Learning ◽

Transfer Learning ◽

Error Detection ◽

Native Speaker ◽

Support Vector ◽

K Nearest Neighbor ◽

Mel Frequency Cepstral Coefficients ◽

Average Accuracy

Computer-Aided Language Learning (CALL) is growing because learning new languages is essential for communication with people of different linguistic backgrounds. Mispronunciation detection is an integral part of CALL and is used to automatically point out errors to the non-native speaker. In this paper, we investigated the mispronunciation detection of Arabic words using a deep Convolutional Neural Network (CNN). For automated pronunciation error detection, we proposed a CNN features-based model and extracted features from different layers of AlexNet (layers 6, 7, and 8) to train three machine learning classifiers: K-nearest neighbor (KNN), Support Vector Machine (SVM), and Random Forest (RF). We also used a transfer learning-based model in which feature extraction and classification are performed automatically. To evaluate the performance of the proposed method, we compared these methods with a traditional machine learning-based method using Mel Frequency Cepstral Coefficients (MFCC) features. We used the same three classifiers, KNN, SVM, and RF, in the baseline method for mispronunciation detection. Experimental results show that handcrafted features, the transfer learning-based method, and classification based on deep features extracted from AlexNet achieved average accuracies of 73.67%, 85%, and 93.20% on Arabic words, respectively. Moreover, these results reveal that the proposed method with feature selection achieved the best average accuracy, 93.20%, among all methods.

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.

Unraveling the deep learning gearbox in optical coherence tomography image segmentation towards explainable artificial intelligence

Communications Biology ◽

10.1038/s42003-021-01697-y ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Peter M. Maloca ◽

Philipp L. Müller ◽

Aaron Y. Lee ◽

Adnan Tufail ◽

Konstantinos Balaskas ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Optical Coherence Tomography ◽

Image Segmentation ◽

Convolutional Neural Network ◽

Learning Algorithm ◽

Ground Truth ◽

Optical Coherence Tomography Image ◽

Optical Coherence ◽

Tomography Image

Abstract. Machine learning has greatly facilitated the analysis of medical data, while its internal operations usually remain opaque. To better comprehend these opaque procedures, a convolutional neural network for optical coherence tomography image segmentation was enhanced with a Traceable Relevance Explainability (T-REX) technique. The proposed application was based on three components: ground truth generation by multiple graders, calculation of Hamming distances among graders and the machine learning algorithm, and a smart data visualization (‘neural recording’). An overall average variability of 1.75% between the human graders and the algorithm was found, slightly lower than the 2.02% among human graders. The ambiguity in ground truth had a noteworthy impact on machine learning results, which could be visualized. The convolutional neural network balanced between graders and allowed for modifiable predictions dependent on the compartment. Using the proposed T-REX setup, machine learning processes could be rendered more transparent and understandable, possibly leading to optimized applications.
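
The Hamming distance comparison among graders and the algorithm can be sketched as the fraction of pixels whose labels differ between two segmentation maps. Normalizing by the total pixel count is an assumption made so that the result reads as a percentage-style variability; the abstract does not state the exact normalization, and the tiny label maps below are invented.

```python
def hamming_distance(mask_a, mask_b):
    """Fraction of pixels whose labels differ between two same-shaped label maps."""
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)):
        raise ValueError("label maps must have the same shape")
    total = sum(len(row) for row in mask_a)
    differing = sum(a != b
                    for ra, rb in zip(mask_a, mask_b)
                    for a, b in zip(ra, rb))
    return differing / total


# Hypothetical 2x4 label maps with three compartments (0, 1, 2).
grader_1 = [[0, 0, 1, 1], [0, 1, 1, 2]]
grader_2 = [[0, 0, 1, 1], [0, 0, 1, 2]]
model = [[0, 0, 1, 1], [0, 1, 2, 2]]

between_graders = hamming_distance(grader_1, grader_2)
grader_1_vs_model = hamming_distance(grader_1, model)
grader_2_vs_model = hamming_distance(grader_2, model)
```

Averaging such pairwise distances over all images would give one variability figure for grader-to-grader pairs and another for grader-to-algorithm pairs.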

Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are increasingly becoming necessary for computer systems connected to the Internet. This software exploits the system’s vulnerabilities to steal valuable information without the user’s knowledge, and stealthily sends it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale in detecting obfuscated and packed malware. Considering that the cause of a problem is often best understood by studying the structural aspects of a program, such as the mnemonics, instruction opcodes, and API calls, in this paper we investigate the relevance of the features of unpacked malicious and benign executables, like mnemonics, instruction opcodes, and API calls, to identify a feature that classifies the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning and deep learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we evaluate the performance of a collection of deep neural networks, namely a Deep Dense network, a One-Dimensional Convolutional Neural Network (1D-CNN), and a CNN-LSTM, in classifying unknown samples, and we observed promising results using APIs and system calls. On combining APIs/system calls with static features, a marginal performance improvement was attained compared with models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that resulted in an F1-score of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.
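
ANOVA-based feature selection scores each feature by how well its values separate the classes. A minimal sketch of the one-way ANOVA F statistic for a single feature follows; the function name and the benign/malicious opcode counts are hypothetical, and the paper's actual selection procedure may differ in its details.

```python
def anova_f(groups):
    """One-way ANOVA F statistic; groups holds one list of feature values per class."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of samples around their own class mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical counts of one opcode in benign and malicious samples.
benign_counts = [1, 2, 3]
malicious_counts = [4, 5, 6]
f_score = anova_f([benign_counts, malicious_counts])
```

Features with a larger F statistic separate the classes more clearly, so a selection step would rank features by this score and keep the top ones.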
