Applying Machine Learning for Improving Performance Classification on Driving Behavior

Traffic accidents are a very difficult problem to handle on a large scale in a country. Indonesia is one of the most populous developing countries, and vehicles are its main means of daily transportation. It also has the largest number of car users in Southeast Asia, so driving safety needs to be considered. Using a machine learning classification method to determine whether a driver is driving safely can help reduce the risk of driving accidents. We created a detection system that classifies whether a driver is driving safely or unsafely using trip sensor data from the gyroscope, accelerometer, and GPS. The classification methods used in this study are Random Forest (RF), Support Vector Machine (SVM), and Multilayer Perceptron (MLP), combined with improved data preprocessing that uses feature extraction and oversampling. The study shows that, with the proposed preprocessing stages, RF performs best, with 98% accuracy, 98% precision, and 97% sensitivity, compared with SVM and MLP.
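
The abstract does not say which oversampling method was used. As a minimal sketch, assuming plain random oversampling (duplicating minority-class rows until the classes are balanced), the idea could look like this in Python. The function name `random_oversample` and the toy trip data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (assumed method, see lead-in)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: four "safe" trips (0) and one "unsafe" trip (1).
rows = [[0.1], [0.2], [0.3], [0.4], [2.5]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling of this kind would normally be applied to the training split only, so that duplicated rows do not leak into the test data.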

Applying Machine Learning for Improving Performance Classification on Driving Behavior

IJITEE (International Journal of Information Technology and Electrical Engineering) ◽

10.22146/ijitee.55599 ◽

2020 ◽

Vol 4 (1) ◽

pp. 8

Author(s):

Ahmad Iwan Fadli ◽

Selo Sulistyo ◽

Sigit Basuki Wibowo

Keyword(s):

Machine Learning ◽

Large Scale ◽

Traffic Accidents ◽

Detection System ◽

Performance Testing ◽

Sensor Data ◽

Support Vector ◽

Machine Learning Classification ◽

Detection Systems ◽

Driving Behaviors

Driving accidents are serious events that can cause fatalities. According to WHO reports, reckless driving behaviors such as speeding, driving under the influence, and operating phones while driving are among the main factors that reduce drivers' focus. Driving accidents are also difficult to handle on a large scale in a country. Using a machine learning classification method to determine whether a driver is driving safely can help reduce the risk of driving accidents. Drivers tend to be more careful when they know that their driving behaviors are being monitored. We created a classifier model that can be applied to detection systems to classify whether a driver is driving safely or unsafely using travel sensor data, which includes gyroscope, accelerometer, and GPS readings. The classification methods used in this study are Random Forest (RF), Support Vector Machine (SVM), and Multilayer Perceptron (MLP). This study shows that RF has the best performance, with 98% accuracy, 98% precision, and 97% sensitivity. Performance testing shows that the proposed pre-processing method can increase the classifier's sensitivity on the research dataset. It is hoped that the classifier model can be applied to a driving detection system so that it can reduce the risk of traffic accidents.
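
The three reported figures follow the standard confusion-matrix definitions. The sketch below shows how they could be computed in Python. The labels are hypothetical, and treating 1 as "unsafe driving" is an assumption for illustration; this is not the authors' evaluation code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and sensitivity (recall) for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of true positives that the classifier found.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels: 1 = unsafe trip, 0 = safe trip (assumed encoding).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity is the figure most affected by class imbalance, which is consistent with the abstract's remark that pre-processing raised it.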

FCS-MBFLEACH: Designing an Energy-Aware Fault Detection System for Mobile Wireless Sensor Networks

Mathematics ◽

10.3390/math8010028 ◽

2019 ◽

Vol 8 (1) ◽

pp. 28 ◽

Cited By ~ 3

Author(s):

Shahaboddin Shamshirband ◽

Javad Hassannataj Joloudari ◽

Mohammad GhasemiGol ◽

Hamid Saadatfar ◽

Amir Mosavi ◽

...

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Fault Detection ◽

Large Scale ◽

Detection System ◽

Sensor Nodes ◽

Wireless Sensor ◽

Support Vector ◽

Detection Accuracy ◽

Classification Methods

Wireless sensor networks (WSNs) consist of large numbers of sensor nodes that are densely and randomly distributed over a geographical region to monitor, identify, and analyze physical events. The crucial challenge in wireless sensor networks is that the sensor nodes depend heavily on limited, non-rechargeable battery power to exchange information wirelessly, which makes it very difficult to manage these nodes and monitor them for abnormal changes. These anomalies appear as faults, including hardware faults, software faults, and attacks by intruders, all of which affect the comprehensiveness of the data collected by wireless sensor networks. Hence, an effective measure should be taken to detect faults in the network early, despite the limitations of the sensor nodes. Machine learning methods include solutions that can be used to detect sensor node faults in the network. The purpose of this study is to use several classification methods, such as MB-FLEACH, one-class support vector machine (SVM), fuzzy one-class, or a combination of SVM and FCS-MBFLEACH, to compute the fault detection accuracy at different densities under two scenarios in regions of interest. It should be noted that in studies so far, no super cluster head (SCH) selection has been performed to detect node faults in the network. The simulation outcomes demonstrate that the FCS-MBFLEACH method has the best performance in terms of fault detection accuracy, false-positive rate (FPR), average remaining energy, and network lifetime compared to the other classification methods.

Comparison of machine learning classification algorithms for land cover change in a coastal area affected by the 2010 Earthquake and Tsunami in Chile

10.5194/nhess-2020-41 ◽

2020 ◽

Author(s):

Matias I. Volke ◽

Rodrigo Abarca-Del-Rio

Keyword(s):

Machine Learning ◽

Land Cover ◽

Performance Testing ◽

Machine Learning Algorithms ◽

Support Vector ◽

Land Cover Changes ◽

Classification Methods ◽

Machine Learning Classification ◽

Geomorphological Changes ◽

Natural Disaster Management

Abstract. Earthquakes and tsunamis are natural events that generate subsequent geomorphological land cover changes. The damage is usually of such importance and of such a diverse nature that it is imperative to have tools that allow quick and precise monitoring. Thus, knowing which classification methods have the best potential to obtain greater precision will improve natural disaster management. We analyze the Tubul locality (37.21° S, 73.43° W) in the Biobío region, Chile, where the greatest geomorphological changes were documented after the earthquake and tsunami of 27 February 2010. These changes can be analyzed using different machine learning methods. We compare the Support Vector Machine (SVM) and Random Forest (RF) methods with the Maximum Likelihood (ML) classification method on Landsat TM and ASTER satellite images. A comparison of classifier performance and accuracy improvement shows that machine learning algorithms are superior to traditional classification methods in terms of overall accuracy and robustness. The general classification accuracy was approximately 97 %. We also visualize the land cover transformations, showing that 26 % of the region was altered. The results of performance testing of the machine learning methodologies were consistent with other studies and present a valid application in the visualization of land cover changes in areas of natural disasters.

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.

Automatic Identification of Upper Extremity Rehabilitation Exercise Type and Dose Using Body-Worn Sensors and Machine Learning: A Pilot Study

Digital Biomarkers ◽

10.1159/000516619 ◽

2021 ◽

pp. 158-166

Author(s):

Noah Balestra ◽

Gaurav Sharma ◽

Linda M. Riek ◽

Ania Busza

Keyword(s):

Machine Learning ◽

Upper Extremity ◽

Sensor Data ◽

Inpatient Setting ◽

Accelerometer Data ◽

Data Set ◽

Machine Learning Classification ◽

Exercise Type ◽

Exercise Dose ◽

Rehabilitation Exercises

Background: Prior studies suggest that participation in rehabilitation exercises improves motor function poststroke; however, studies on optimal exercise dose and timing have been limited by the technical challenge of quantifying exercise activities over multiple days. Objectives: The objectives of this study were to assess the feasibility of using body-worn sensors to track rehabilitation exercises in the inpatient setting and investigate which recording parameters and data analysis strategies are sufficient for accurately identifying and counting exercise repetitions. Methods: MC10 BioStampRC® sensors were used to measure accelerometer and gyroscope data from upper extremities of healthy controls (n = 13) and individuals with upper extremity weakness due to recent stroke (n = 13) while the subjects performed 3 preselected arm exercises. Sensor data were then labeled by exercise type and this labeled data set was used to train a machine learning classification algorithm for identifying exercise type. The machine learning algorithm and a peak-finding algorithm were used to count exercise repetitions in non-labeled data sets. Results: We achieved a repetition counting accuracy of 95.6% overall, and 95.0% in patients with upper extremity weakness due to stroke when using both accelerometer and gyroscope data. Accuracy was decreased when using fewer sensors or using accelerometer data alone. Conclusions: Our exploratory study suggests that body-worn sensor systems are technically feasible, well tolerated in subjects with recent stroke, and may ultimately be useful for developing a system to measure total exercise “dose” in poststroke patients during clinical rehabilitation or clinical trials.
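
The abstract mentions a peak-finding algorithm for counting repetitions but gives no details. As a rough sketch only, one simple way to count repetitions in Python is to count local maxima that exceed a height threshold and are far enough apart. The function name, the thresholds, and the toy signal below are all hypothetical and do not reproduce the study's method or data.

```python
def count_repetitions(signal, min_height, min_distance):
    """Count local maxima that reach min_height and lie at least
    min_distance samples after the previously accepted peak."""
    count = 0
    last_peak = None
    for i in range(1, len(signal) - 1):
        is_peak = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_peak or signal[i] < min_height:
            continue  # Not a maximum, or too small to be a repetition.
        if last_peak is not None and i - last_peak < min_distance:
            continue  # Too close to the previous peak; treat as noise.
        count += 1
        last_peak = i
    return count


# Hypothetical accelerometer magnitude trace with three clear movements.
trace = [0, 1, 3, 1, 0, 0.4, 0.2, 1, 4, 1, 0, 2, 3.5, 2, 0]
reps = count_repetitions(trace, min_height=2, min_distance=3)
print(reps)
```

The small bump at 0.4 is ignored because it is below the height threshold, which is the kind of filtering a repetition counter needs on noisy sensor data.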

QUBO formulations for training machine learning models

Scientific Reports ◽

10.1038/s41598-021-89461-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Prasanna Date ◽

Davis Arthur ◽

Lauren Pusey-Nazzaro

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Large Scale ◽

Support Vector ◽

Quantum Computers ◽

Np Hard ◽

Learning Models ◽

Moore's Law ◽

Machine Learning Models

Abstract. Training machine learning models on classical computers is usually a time and compute intensive process. With Moore’s law nearing its inevitable end and an ever-increasing demand for large-scale data analysis using machine learning, we must leverage non-conventional computing paradigms like quantum computing to train machine learning models efficiently. Adiabatic quantum computers can approximately solve NP-hard problems, such as the quadratic unconstrained binary optimization (QUBO), faster than classical computers. Since many machine learning problems are also NP-hard, we believe adiabatic quantum computers might be instrumental in training machine learning models efficiently in the post Moore’s law era. In order to solve problems on adiabatic quantum computers, they must be formulated as QUBO problems, which is very challenging. In this paper, we formulate the training problems of three machine learning models—linear regression, support vector machine (SVM) and balanced k-means clustering—as QUBO problems, making them conducive to be trained on adiabatic quantum computers. We also analyze the computational complexities of our formulations and compare them to corresponding state-of-the-art classical approaches. We show that the time and space complexities of our formulations are better (in case of SVM and balanced k-means clustering) or equivalent (in case of linear regression) to their classical counterparts.
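
For readers unfamiliar with the term, a QUBO problem asks for the binary vector x that minimizes x^T Q x for a given matrix Q. The Python sketch below solves a tiny, made-up instance by exhaustive search on a classical machine. It only illustrates the problem form; it is not one of the paper's formulations, and the matrix `Q` is a hypothetical example.

```python
from itertools import product


def qubo_energy(Q, x):
    """Evaluate the QUBO objective x^T Q x for a binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_qubo_brute_force(Q):
    """Try all 2^n binary vectors; feasible only for very small n."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


# Hypothetical instance: each variable is rewarded (-1 on the diagonal),
# but switching on two neighbouring variables is penalized (+2).
Q = [
    [-1, 2, 0],
    [0, -1, 2],
    [0, 0, -1],
]
best = solve_qubo_brute_force(Q)
print(best, qubo_energy(Q, best))
```

Exhaustive search grows as 2^n, which is why hardware that samples low-energy solutions of this objective directly is of interest for larger problems.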

A Two-Stage Machine Learning Classification Approach to Identify Extremism in Arabic Opinions

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/391022021 ◽

2021 ◽

Vol 10 (2) ◽

pp. 736-745

Keyword(s):

Machine Learning ◽

Binary Classification ◽

Feature Selection Method ◽

Support Vector ◽

Two Stage ◽

Machine Learning Classification ◽

Second Stage ◽

Testing Data ◽

Stage Classification ◽

Positive Dataset

The increased use of the Internet and social networks has enabled people to express their views, which has attracted increasing attention lately. Sentiment Analysis (SA) techniques are used to determine the polarity of information, either positive or negative, toward a given topic, including opinions. In this research, we introduce a machine learning approach based on Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF) classifiers to find and classify extreme opinions in Arabic reviews. To achieve this, a dataset of 1500 Arabic reviews was collected from the Google Play Store. In addition, a two-stage classification process was applied to classify the reviews. In the first stage, we built a binary classifier to separate positive from negative reviews. In the second stage, we applied a binary classification mechanism based on a set of proposed rules that distinguishes extreme positive from positive reviews, and extreme negative from negative reviews. Four major experiments were conducted, with a total of 10 different sub-experiments, to fulfill the two-stage process using different cross-validation schemes and the Term Frequency-Inverse Document Frequency feature selection method. The results indicated that SVM was the best in the first-stage classification with 30% testing data, and NB was the best with 20% testing data. The results of the second-stage classification indicated that SVM scored better in identifying extreme positive reviews in the positive dataset, with an overall accuracy of 68.7%, and NB showed better accuracy in identifying extreme negative reviews in the negative dataset, with an overall accuracy of 72.8%.
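
Term Frequency-Inverse Document Frequency (TF-IDF) has several variants, and the abstract does not say which one was used. The Python sketch below assumes the basic form, with term frequency as a share of the document length and an unsmoothed logarithmic inverse document frequency. The tokenized English reviews are hypothetical stand-ins for the Arabic data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document, using
    tf = count / doc length and idf = ln(N / document frequency)."""
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Hypothetical tokenized reviews.
reviews = [["great", "app"], ["bad", "app"], ["great", "great", "app"]]
weights = tf_idf(reviews)
print(weights)
```

In this toy corpus, "app" appears in every review and gets a weight of zero, while the rarer "bad" gets the largest weight, which is the property that makes TF-IDF useful for picking discriminative terms.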

Machine-Learning for the Prediction of Lost Circulation Events - Time Series Analysis and Model Evaluation

10.2118/204706-ms ◽

2021 ◽

Author(s):

Arturo Magana-Mora ◽

Mohammad AlJubran ◽

Jothibasu Ramasamy ◽

Mohammed AlBassam ◽

Chinthaka Gooneratne ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Real Time ◽

Large Scale ◽

Model Comparison ◽

Drilling Fluid ◽

Sensor Data ◽

False Alarms ◽

Suitable Model ◽

Lost Circulation

Abstract. Objective/Scope. Lost circulation events (LCEs) are among the top causes for drilling nonproductive time (NPT). The presence of natural fractures and vugular formations causes loss of drilling fluid circulation. Drilling depleted zones with incorrect mud weights can also lead to drilling induced losses. LCEs can also develop into additional drilling hazards, such as stuck pipe incidents, kicks, and blowouts. An LCE is traditionally diagnosed only when there is a reduction in mud volume in mud pits in the case of moderate losses or reduction of mud column in the annulus in total losses. Using machine learning (ML) for predicting the presence of a loss zone and the estimation of fracture parameters ahead is very beneficial as it can immediately alert the drilling crew in order for them to take the required actions to mitigate or cure LCEs. Methods, Procedures, Process. Although different computational methods have been proposed for the prediction of LCEs, there is a need to further improve the models and reduce the number of false alarms. Robust and generalizable ML models require a sufficiently large amount of data that captures the different parameters and scenarios representing an LCE. For this, we derived a framework that automatically searches through historical data, locates LCEs, and extracts the surface drilling and rheology parameters surrounding such events. Results, Observations, and Conclusions. We derived different ML models utilizing various algorithms and evaluated them using the data-split technique at the level of wells to find the most suitable model for the prediction of an LCE. From the model comparison, random forest classifier achieved the best results and successfully predicted LCEs before they occurred. The developed LCE model is designed to be implemented in the real-time drilling portal as an aid to the drilling engineers and the rig crew to minimize or avoid NPT. Novel/Additive Information. The main contribution of this study is the analysis of real-time surface drilling parameters and sensor data to predict an LCE from a statistically representative number of wells. The large-scale analysis of several wells that appropriately describe the different conditions before an LCE is critical for avoiding model undertraining or lack of model generalization. Finally, we formulated the prediction of LCEs as a time-series problem and considered parameter trends to accurately determine the early signs of LCEs.
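
Splitting data "at the level of wells" presumably means that all records from one well fall entirely in either the training set or the test set, so a model is never tested on a well it has already seen. The Python sketch below illustrates that idea under this assumption. The function name, the test fraction, and the well identifiers are hypothetical.

```python
import random


def split_by_group(groups, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so that all rows sharing a group id
    (here: a well) end up on the same side of the split."""
    unique = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    test_groups = set(unique[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx


# Hypothetical well id for each sensor record.
wells = ["W1", "W1", "W2", "W2", "W3", "W4", "W4", "W4"]
train_idx, test_idx = split_by_group(wells)
print(sorted({wells[i] for i in test_idx}))
```

A row-level random split would scatter records from the same well across both sets, which tends to overstate how well a model generalizes to new wells.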

Machine learning classification methods informing the management of inconclusive reactors at bovine tuberculosis surveillance tests in England

Preventive Veterinary Medicine ◽

10.1016/j.prevetmed.2021.105565 ◽

2021 ◽

pp. 105565

Author(s):

M. Pilar Romero ◽

Yu-Mei Chang ◽

Lucy A. Brunton ◽

Jessica Parry ◽

Alison Prosser ◽

...

Keyword(s):

Machine Learning ◽

Bovine Tuberculosis ◽

Classification Methods ◽

Machine Learning Classification

Predicting ionizing radiation exposure using biochemically-inspired genomic machine learning

F1000Research ◽

10.12688/f1000research.14048.1 ◽

2018 ◽

Vol 7 ◽

pp. 233

Author(s):

Jonathan Z.L. Zhao ◽

Eliseos J. Mucaki ◽

Peter K. Rogan

Keyword(s):

Machine Learning ◽

Ionizing Radiation ◽

Radiation Exposure ◽

Large Scale ◽

Nearest Neighbor ◽

Error Rates ◽

Support Vector ◽

Dose Estimation ◽

Gene Signatures ◽

Ionizing Radiation Exposure

Background: Gene signatures derived from transcriptomic data using machine learning methods have shown promise for biodosimetry testing. These signatures may not be sufficiently robust for large scale testing, as their performance has not been adequately validated on external, independent datasets. The present study develops human and murine signatures with biochemically-inspired machine learning that are strictly validated using k-fold and traditional approaches. Methods: Gene Expression Omnibus (GEO) datasets of exposed human and murine lymphocytes were preprocessed via nearest neighbor imputation and expression of genes implicated in the literature to be responsive to radiation exposure (n=998) were then ranked by Minimum Redundancy Maximum Relevance (mRMR). Optimal signatures were derived by backward, complete, and forward sequential feature selection using Support Vector Machines (SVM), and validated using k-fold or traditional validation on independent datasets. Results: The best human signatures we derived exhibit k-fold validation accuracies of up to 98% (DDB2, PRKDC, TPP2, PTPRE, and GADD45A) when validated over 209 samples and traditional validation accuracies of up to 92% (DDB2, CD8A, TALDO1, PCNA, EIF4G2, LCN2, CDKN1A, PRKCH, ENO1, and PPM1D) when validated over 85 samples. Some human signatures are specific enough to differentiate between chemotherapy and radiotherapy. Certain multi-class murine signatures have sufficient granularity in dose estimation to inform eligibility for cytokine therapy (assuming these signatures could be translated to humans). We compiled a list of the most frequently appearing genes in the top 20 human and mouse signatures. More frequently appearing genes among an ensemble of signatures may indicate greater impact of these genes on the performance of individual signatures. Several genes in the signatures we derived are present in previously proposed signatures. Conclusions: Gene signatures for ionizing radiation exposure derived by machine learning have low error rates in externally validated, independent datasets, and exhibit high specificity and granularity for dose estimation.
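
In k-fold validation, the samples are divided into k parts and each part serves once as the held-out test set while the rest are used for training. The Python sketch below generates such index splits. The sample count and k are arbitrary illustrative values, since the abstract does not state the k used, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds whose
    sizes differ by at most one sample."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Illustrative values only: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

In practice the samples are usually shuffled before the folds are cut, and the reported accuracy is the average over the k held-out parts.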
