Machine Learning Methods for Detecting Internet-of-Things (IoT) Malware

This study analyzes the performance of machine learning models for detecting Internet of Things (IoT) malware using a recent IoT dataset. Experiments were conducted with nine well-known machine learning techniques: Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), k-Nearest Neighbors (KNN), Support Vector Machines (SVM), Neural Networks (NN), Random Forest (RF), Bagging (BG), and Stacking (ST). The results show 100% accuracy in detecting IoT malware for DT, SVM, RF, and BG; about 99.9% for LR, NB, KNN, and NN; and only 28.16% for the ST classifier. These results are also higher than those of other proposed machine learning models evaluated on the same dataset. The results can therefore help both researchers and application developers design and build intelligent malware detection systems for IoT devices.

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Moojung Kim ◽  
Young Jae Kim ◽  
Sung Jin Park ◽  
Kwang Gi Kim ◽  
Pyung Chun Oh ◽  
...  

Abstract. Background: Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods: Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. Classification was performed using logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB). Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results: The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared. For the ≥ 65 age group, XGB (84.7%) and RF (84.7%) had the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM had the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions: The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.


2020 ◽  
Author(s):  
Christopher Zhou ◽  
William Grumbles ◽  
Thomas Cundari

Six machine learning models (random forest, neural network, support vector machine, k-nearest neighbors, Bayesian ridge regression, least squares linear regression) were trained on a dataset of 3d transition metal-methyl and -methane complexes to predict pKa(C–H), a property demonstrated to be important in catalytic activity and selectivity. Results illustrate that the machine learning models are quite promising, with RMSE metrics ranging from 4.6 to 8.8 pKa units, despite the relatively modest amount of data available to train on. Importantly, the machine learning models agreed that (a) conjugate base properties were more impactful than those of the corresponding conjugate acid, and (b) the energy of the highest occupied molecular orbital of the conjugate base was the most significant input feature in the prediction of pKa(C–H). Furthermore, results from additional testing conducted using an external dataset of Sc-methyl complexes demonstrated the robustness of all models, with RMSE metrics ranging from 1.5 to 6.6 pKa units. In all, this research demonstrates the potential of machine learning models in organometallic catalyst development.
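As a reading aid for the RMSE figures quoted above, the following is a minimal sketch of how a root-mean-square error is computed. The function name `rmse` and the numeric values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reference and predicted pKa values (illustration only).
reference = [40.0, 45.0, 50.0]
predicted = [43.0, 41.0, 50.0]
print(round(rmse(reference, predicted), 2))
```

Because RMSE is expressed in the units of the predicted quantity, a value of 4.6 can be read directly as a typical error of about 4.6 pKa units.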


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6019
Author(s):  
José Manuel Lozano Domínguez ◽  
Faroq Al-Tam ◽  
Tomás de J. Mateo Sanguino ◽  
Noélia Correia

Improving road safety through artificial intelligence-based systems is now crucial to turning smart cities into a reality. Under this highly relevant and extensive heading, an approach is proposed to improve vehicle detection in smart crosswalks using machine learning models. Contrary to classic fuzzy classifiers, machine learning models do not require the readjustment of labels that depend on the location of the system and the road conditions. Several machine learning models were trained and tested using real traffic data taken from urban scenarios in both Portugal and Spain. These include random forest, time-series forecasting, multi-layer perceptron, support vector machine, and logistic regression models. A deep reinforcement learning agent, based on a state-of-the-art double-deep recurrent Q-network, is also designed and compared with the machine learning models just mentioned. Results show that the machine learning models can efficiently replace the classic fuzzy classifier.


Geosciences ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 265
Author(s):  
Stefan Rauter ◽  
Franz Tschuchnigg

The classification of soils into categories with a similar range of properties is a fundamental geotechnical engineering procedure. At present, this classification is based on various types of cost- and time-intensive laboratory and/or in situ tests. These soil investigations are essential for each individual construction site and have to be performed prior to the design of a project. Since Machine Learning could play a key role in reducing the costs and time needed for a suitable site investigation program, the basic ability of Machine Learning models to classify soils from Cone Penetration Tests (CPT) is evaluated. To find an appropriate classification model, 24 different Machine Learning models, based on three different algorithms, are built and trained on a dataset consisting of 1339 CPT. The applied algorithms are a Support Vector Machine, an Artificial Neural Network and a Random Forest. As input features, different combinations of direct cone penetration test data (tip resistance qc, sleeve friction fs, friction ratio Rf, depth d), combined with “defined”, thus, not directly measured data (total vertical stresses σv, effective vertical stresses σ’v and hydrostatic pore pressure u0), are used. Standard soil classes based on grain size distributions and soil classes based on soil behavior types according to Robertson are applied as targets. The different models are compared with respect to their prediction performance and the required learning time. The best results for all targets were obtained with models using a Random Forest classifier. For the soil classes based on grain size distribution, an accuracy of about 75%, and for soil classes according to Robertson, an accuracy of about 97–99%, was reached.
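Two of the input features listed above follow from standard geotechnical definitions: the friction ratio is the sleeve friction divided by the tip resistance, expressed in percent, and the effective vertical stress is the total vertical stress minus the pore pressure. The sketch below assumes these textbook definitions; the function names and sample values are hypothetical, and the abstract does not state how the authors computed their features.

```python
def friction_ratio(q_c, f_s):
    """Friction ratio Rf in percent; q_c and f_s must share the same unit."""
    if q_c <= 0:
        raise ValueError("tip resistance q_c must be positive")
    return 100.0 * f_s / q_c


def effective_vertical_stress(sigma_v, u0):
    """Effective vertical stress: total vertical stress minus pore pressure."""
    return sigma_v - u0


# Hypothetical readings in kPa (illustration only, not data from the study).
q_c, f_s = 5000.0, 50.0
sigma_v, u0 = 190.0, 49.05

print(friction_ratio(q_c, f_s))
print(effective_vertical_stress(sigma_v, u0))
```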


Minerals ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1128
Author(s):  
Sebeom Park ◽  
Dahee Jung ◽  
Hoang Nguyen ◽  
Yosoon Choi

This study proposes a method for diagnosing problems in truck ore transport operations in underground mines using four machine learning models (i.e., Gaussian naïve Bayes (GNB), k-nearest neighbor (kNN), support vector machine (SVM), and classification and regression tree (CART)) and data collected by an Internet of Things system. A limestone underground mine with an applied mine production management system (using a tablet computer and Bluetooth beacon) is selected as the research area, and log data related to the truck travel time are collected. The machine learning models are trained and verified using the collected data, and grid search through 5-fold cross-validation is performed to improve the prediction accuracy of the models. The accuracy of CART is highest when the parameters leaf and split are set to 1 and 4, respectively (94.1%). In the validation of the machine learning models performed using the validation dataset (1500), the accuracy of the CART was 94.6%, and the precision and recall were 93.5% and 95.7%, respectively. In addition, it is confirmed that the F1 score reaches values as high as 94.6%. Through field application and analysis, it is confirmed that the proposed CART model can be utilized as a tool for monitoring and diagnosing the status of truck ore transport operations.
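The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does this arithmetic; the function name `f1_score` is only an illustrative helper, not code from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall for CART as reported in the abstract above.
f1 = f1_score(0.935, 0.957)
print(f"{f1:.1%}")  # prints 94.6%
```

The result agrees with the 94.6% F1 score stated in the abstract.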


2020 ◽  
Vol 12 (16) ◽  
pp. 2655 ◽  
Author(s):  
Hugo Crisóstomo de Castro Filho ◽  
Osmar Abílio de Carvalho Júnior ◽  
Osmar Luiz Ferreira de Carvalho ◽  
Pablo Pozzobon de Bem ◽  
Rebeca dos Santos de Moura ◽  
...  

Synthetic Aperture Radar (SAR) time series make it possible to describe the rice phenological cycle through the backscattering time signature. The advent of the Copernicus Sentinel-1 program has therefore expanded studies of radar data (C-band) for rice monitoring at regional scales, owing to its high temporal resolution and free data distribution. Recurrent Neural Network (RNN) models have reached state-of-the-art performance in pattern recognition of time-sequenced data, obtaining a significant advantage in crop classification on remote sensing images. One of the most used RNN approaches is the Long Short-Term Memory (LSTM) model and its improvements, such as Bidirectional LSTM (Bi-LSTM). Bi-LSTM models are more effective because their output depends on both the previous and the next segment, in contrast to unidirectional LSTM models. The present research aims to map rice crops from Sentinel-1 time series (C-band) using LSTM and Bi-LSTM models in West Rio Grande do Sul (Brazil). We compared the results with traditional machine learning techniques: Support Vector Machines (SVM), Random Forest (RF), k-Nearest Neighbors (k-NN), and Normal Bayes (NB). The developed methodology can be subdivided into the following steps: (a) acquisition of the Sentinel time series over two years; (b) data pre-processing, with noise minimization by 3D spatial-temporal filters and smoothing with a Savitzky-Golay filter; (c) time series classification procedures; and (d) accuracy analysis and comparison among the methods. The results show high overall accuracy and Kappa (>97% for all methods and metrics). Bi-LSTM was the best model, presenting statistical differences in the McNemar test at a significance level of 0.05. However, LSTM and traditional machine learning models also achieved high accuracy values. The study establishes an adequate methodology for mapping the rice crops in West Rio Grande do Sul.
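The McNemar comparison mentioned above tests whether two classifiers evaluated on the same samples disagree more often in one direction than the other. The following is a minimal sketch of the chi-square form of the test, assuming the common continuity correction; the abstract does not say which variant the authors used, and the counts `b` and `c` below are hypothetical.

```python
# Critical value of the chi-square distribution with 1 degree of freedom
# at a significance level of 0.05.
CHI2_CRITICAL_0_05_DF1 = 3.841


def mcnemar_statistic(b, c):
    """McNemar chi-square statistic with continuity correction.

    b: samples classified correctly by model A only
    c: samples classified correctly by model B only
    """
    if b + c == 0:
        return 0.0  # the two models never disagree
    return (abs(b - c) - 1) ** 2 / (b + c)


# Hypothetical disagreement counts (illustration only).
b, c = 25, 10
statistic = mcnemar_statistic(b, c)
print(statistic, statistic > CHI2_CRITICAL_0_05_DF1)
```

Only the discordant samples enter the statistic, which is why two models that both exceed 97% accuracy can still differ significantly.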


2021 ◽  
Author(s):  
Siddhartha Bhattacharyya ◽  
Parth Ganeriwala ◽  
Shreya Nandanwar ◽  
Raja Muthalagu ◽  
anubhav gupta

Internet of Things (IoT) devices are the most commonly used devices today and provide services that have become widely prevalent. With their success and growing need, the number of threats and attacks against IoT devices and services has been increasing exponentially. With the increase in knowledge of IoT-related threats and adequate monitoring technologies, the potential to detect these threats is becoming a reality. Various studies have applied fingerprinting-based approaches to device identification, but none has taken the full protocol spectrum into account. IPAssess is a novel fingerprinting-based model that takes a feature set based on the correlation between the device characteristics and the protocols and then applies various machine learning models to perform device identification and classification. We have also used aggregation and augmentation to enhance the algorithm. In our experimental study, IPAssess performs IoT device identification with a 99.6% classification accuracy.

