Performance of Machine Learning-Based Multi-Model Voting Ensemble Methods for Network Threat Detection in Agriculture 4.0

The upcoming agricultural revolution, known as Agriculture 4.0, integrates cutting-edge Information and Communication Technologies into existing operations. The cyber threats that accompany this integration have attracted increasing interest from security researchers. Network traffic analysis and classification based on Machine Learning (ML) methodologies can play a vital role in tackling such threats. To this end, this research work presents and evaluates different ML classifiers for network traffic classification, i.e., K-Nearest Neighbors (KNN), Support Vector Classification (SVC), Decision Tree (DT), Random Forest (RF) and Stochastic Gradient Descent (SGD), as well as a hard voting and a soft voting ensemble model of these classifiers. Three variations of the NSL-KDD dataset were utilized, i.e., the initial dataset, an undersampled dataset and an oversampled dataset. The performance of the individual ML algorithms was evaluated on all three dataset variations and compared to the performance of the voting ensemble methods. In most cases, both the hard and the soft voting models were found to achieve higher accuracy than the individual models.
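
The difference between the hard and soft voting ensembles mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `hard_vote` and `soft_vote` and the probability values are hypothetical, and the sketch assumes each base classifier reports per-class probabilities for a single network flow.

```python
from collections import Counter


def hard_vote(labels):
    """Return the label predicted by the most classifiers (majority vote)."""
    # Ties are resolved in favour of the label that appears first in the list.
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average the per-class probabilities and return the highest-scoring class."""
    classes = probabilities[0].keys()
    n = len(probabilities)
    mean = {c: sum(p[c] for p in probabilities) / n for c in classes}
    return max(mean, key=mean.get)


# Hypothetical outputs of three base classifiers for one flow (illustrative values).
probs = [
    {"normal": 0.90, "attack": 0.10},
    {"normal": 0.40, "attack": 0.60},
    {"normal": 0.45, "attack": 0.55},
]
# Each classifier's hard label is its most probable class.
labels = [max(p, key=p.get) for p in probs]

print("hard voting:", hard_vote(labels))
print("soft voting:", soft_vote(probs))
```

In this made-up case the two schemes disagree: two of three classifiers lean weakly towards "attack", while the averaged probabilities favour "normal" because one classifier is very confident. Which scheme works better on a given dataset is an empirical question.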


Searching for optimal machine learning algorithm for network traffic classification in intrusion detection system

ITM Web of Conferences ◽

10.1051/itmconf/20182100027 ◽

2018 ◽

Vol 21 ◽

pp. 00027

Author(s):

Alicja Gerka

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Network Traffic ◽

Detection System ◽

Learning Algorithms ◽

Attack Detection ◽

Machine Learning Algorithms ◽

Support Vector ◽

Traffic Classification ◽

Network Traffic Classification

The main problem associated with the development of an effective network behaviour anomaly detection-based IDS model is the selection of the optimal network traffic classification method. This article presents the results of simulation research on the effectiveness of machine learning algorithms in detecting network attacks. The research part of the work concerned finding the optimal network packet classification method that can be implemented in the attack detection module of an intrusion detection system. During the research, the performance of three machine learning algorithms (Artificial Neural Network, Support Vector Machine and Naïve Bayes Classifier) was compared using a dataset from the KDD Cup competition. Attention was also paid to the relationship between the values of algorithm parameters and their effectiveness. The work also contains a short analysis of the state of cybersecurity in Poland.


A Comparative Study of Traffic Classification Techniques for Smart City Networks

Sensors ◽

10.3390/s21144677 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4677

Author(s):

Razan M. AlZoman ◽

Mohammed J. F. Alenazi

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Network Management ◽

Network Traffic ◽

Smart City ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Traffic Classification ◽

Network Traffic Classification

Smart city networks involve many applications that impose specific Quality of Service (QoS) requirements, thus representing a challenging scenario for network management. Solutions aiming to guarantee QoS support have not been deployed in large-scale networks. Traffic classification is a mechanism used to manage different aspects, including QoS requirements. However, conventional traffic classification methods, such as the port-based method, are inefficient because of their inability to handle dynamic port allocation and encryption. Traffic classification using machine learning has gained research interest as an alternative method to achieve high performance. In fact, machine learning embeds intelligence into network functions, thus improving network management. In this study, we apply machine learning algorithms to predict network traffic classification. We apply four supervised learning algorithms: support vector machine, random forest, k-nearest neighbors, and decision tree. We also apply a port-based method of traffic classification based on applications’ popular assigned port numbers. Then, we compare the results of this method to those obtained from the machine learning algorithms. The evaluation results indicate that the decision tree algorithm provides the highest average accuracy among the evaluated algorithms, at 99.18%. Moreover, network traffic classification using machine learning provides more accurate results and higher performance than the port-based method.
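
The port-based baseline described in this abstract can be sketched as a simple table lookup. This is an illustrative sketch, not the study's code: the table `WELL_KNOWN_PORTS` and the function `classify_by_port` are hypothetical names, and the table holds only a handful of standard port assignments.

```python
# Hypothetical lookup table with a few standard port assignments.
WELL_KNOWN_PORTS = {
    22: "ssh",
    25: "smtp",
    53: "dns",
    80: "http",
    443: "https",
}


def classify_by_port(src_port, dst_port):
    """Label a flow by the first of its ports that appears in the table."""
    # The destination port is checked first because it usually names the service.
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    # Dynamically allocated ports carry no application information.
    return "unknown"


print(classify_by_port(51514, 443))
print(classify_by_port(50000, 50001))
```

The second call shows the weakness the abstract points to: a flow between two dynamically allocated ports cannot be labelled from port numbers alone.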


Machine Learning for Sensorless Temperature Estimation of a BLDC Motor

Sensors ◽

10.3390/s21144655 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4655

Author(s):

Dariusz Czerwinski ◽

Jakub Gęca ◽

Krzysztof Kolano

Keyword(s):

Machine Learning ◽

Temperature Measurement ◽

Stochastic Gradient Descent ◽

Estimation Accuracy ◽

Coefficient Of Determination ◽

Percentage Error ◽

Support Vector ◽

Bldc Motor ◽

Temperature Estimation ◽

Motor Operation

In this article, the authors propose two models for BLDC motor winding temperature estimation using machine learning methods. For the purposes of the research, measurements were made for over 160 h of motor operation, and then, they were preprocessed. The algorithms of linear regression, ElasticNet, stochastic gradient descent regressor, support vector machines, decision trees, and AdaBoost were used for predictive modeling. The ability of the models to generalize was achieved by hyperparameter tuning with the use of cross-validation. The conducted research led to promising results of the winding temperature estimation accuracy. In the case of sensorless temperature prediction (model 1), the mean absolute percentage error MAPE was below 4.5% and the coefficient of determination R2 was above 0.909. In addition, the extension of the model with the temperature measurement on the casing (model 2) allowed reducing the error value to about 1% and increasing R2 to 0.990. The results obtained for the first proposed model show that the overheating protection of the motor can be ensured without direct temperature measurement. In addition, the introduction of a simple casing temperature measurement system allows for an estimation with accuracy suitable for compensating the motor output torque changes related to temperature.
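
The two metrics reported in this abstract, MAPE and the coefficient of determination, can be computed as in the sketch below. It assumes the common textbook definitions, which the abstract itself does not spell out, and the temperature values are invented for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical winding temperatures in degrees Celsius (illustrative values only).
measured = [50.0, 60.0, 80.0, 100.0]
estimated = [52.0, 57.0, 84.0, 95.0]

print(f"MAPE = {mape(measured, estimated):.2f} %")
print(f"R2   = {r2(measured, estimated):.3f}")
```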


Comparison of Ensemble Machine Learning Methods for Soil Erosion Pin Measurements

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10010042 ◽

2021 ◽

Vol 10 (1) ◽

pp. 42

Author(s):

Kieu Anh Nguyen ◽

Walter Chen ◽

Bor-Shiun Lin ◽

Uma Seeboonruang

Keyword(s):

Machine Learning ◽

Soil Erosion ◽

Ensemble Methods ◽

Machine Learning Algorithms ◽

Multivariate Adaptive Regression Splines ◽

Gradient Boosting ◽

Support Vector ◽

Ensemble Machine Learning ◽

Boosting Method ◽

Bagging Method

Although machine learning has been extensively used in various fields, it has only recently been applied to soil erosion pin modeling. To improve upon previous methods of quantifying soil erosion based on erosion pin measurements, this study explored the possible application of ensemble machine learning algorithms to the Shihmen Reservoir watershed in northern Taiwan. Three categories of ensemble methods were considered in this study: (a) bagging, (b) boosting, and (c) stacking. The bagging method in this study refers to bagged multivariate adaptive regression splines (bagged MARS) and random forest (RF), and the boosting method includes Cubist and gradient boosting machine (GBM). Finally, the stacking method is an ensemble method that uses a meta-model to combine the predictions of base models. This study used RF and GBM as the meta-models, and decision tree, linear regression, artificial neural network, and support vector machine as the base models. The dataset used in this study was sampled using stratified random sampling to achieve a 70/30 split for the training and test data, and the process was repeated three times. The performance of six ensemble methods in three categories was analyzed based on the average of three attempts. It was found that GBM performed the best among the ensemble models with the lowest root-mean-square error (RMSE = 1.72 mm/year), the highest Nash-Sutcliffe efficiency (NSE = 0.54), and the highest index of agreement (d = 0.81). This result was confirmed by the spatial comparison of the absolute differences (errors) between model predictions and observations using GBM and RF in the study area. In summary, the results show that as a group, the bagging method and the boosting method performed equally well, and the stacking method was third for the erosion pin dataset considered in this study.
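
The three evaluation metrics named in this abstract can be written out as a short sketch. It assumes the usual definitions of RMSE and Nash-Sutcliffe efficiency and the original Willmott form of the index of agreement; the abstract does not state which variant of d was used, and the erosion values below are invented.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den


def index_of_agreement(obs, pred):
    """Willmott's index of agreement d (assumed variant), ranging from 0 to 1."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


# Hypothetical erosion depths in mm/year (illustrative values only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 2.0, 4.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"NSE  = {nse(observed, predicted):.3f}")
print(f"d    = {index_of_agreement(observed, predicted):.3f}")
```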


MODC: A Pareto-Optimal Optimization Approach for Network Traffic Classification Based on the Divide and Conquer Strategy

Information ◽

10.3390/info9090233 ◽

2018 ◽

Vol 9 (9) ◽

pp. 233 ◽

Cited By ~ 1

Author(s):

Zuleika Nascimento ◽

Djamel Sadok

Keyword(s):

Machine Learning ◽

Network Traffic ◽

Machine Learning Algorithms ◽

Divide And Conquer ◽

Pareto Optimal ◽

Optimization Approach ◽

Traffic Classification ◽

Multi Objective ◽

Network Traffic Classification ◽

Changes Over Time

Network traffic classification aims to identify the traffic categories or applications of network packets or flows. It is an area that continues to gain attention from researchers due to the need to understand the composition of network traffic, which changes over time, in order to ensure the network Quality of Service (QoS). Among the different methods of network traffic classification, the payload-based one (DPI) is the most accurate, but it has some drawbacks, such as the inability to classify encrypted data, concerns regarding users' privacy, high computational costs, and ambiguity when multiple signatures might match. For that reason, machine learning methods have been proposed to overcome these issues. This work proposes a Multi-Objective Divide and Conquer (MODC) model for network traffic classification that combines supervised and unsupervised machine learning algorithms into a hybrid model based on the divide and conquer strategy. Additionally, it is a flexible model, since it allows network administrators to choose from a set of parameters (Pareto-optimal solutions), produced by a multi-objective optimization process, by prioritizing flow or byte accuracy. Our method achieved an average flow accuracy of 94.14% for the analyzed dataset, outperforming the six DPI-based tools investigated, including two commercial ones, and other machine learning-based methods.
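
The trade-off between flow accuracy and byte accuracy mentioned in this abstract can be made concrete with a small sketch. The definitions used here (fraction of correctly labelled flows, and fraction of bytes carried by correctly labelled flows) are the ones commonly used in the traffic classification literature and are an assumption, since the abstract does not define them; the function name and sample flows are hypothetical.

```python
def flow_and_byte_accuracy(flows):
    """Return (flow accuracy, byte accuracy) for (true, predicted, bytes) records."""
    flows = list(flows)
    # A flow counts as correct when its predicted label equals its true label.
    correct = [f for f in flows if f[0] == f[1]]
    flow_acc = len(correct) / len(flows)
    byte_acc = sum(f[2] for f in correct) / sum(f[2] for f in flows)
    return flow_acc, byte_acc


# Hypothetical flows: two small web flows and one large misclassified P2P flow.
sample = [
    ("web", "web", 1_000),
    ("web", "web", 2_000),
    ("p2p", "web", 97_000),
]

flow_acc, byte_acc = flow_and_byte_accuracy(sample)
print(f"flow accuracy = {flow_acc:.2f}, byte accuracy = {byte_acc:.2f}")
```

A classifier can label most flows correctly and still mislabel most of the bytes when the few misclassified flows are large, which is why the two objectives may need to be prioritized separately.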


Iterative-tuning support vector machine for network traffic classification

2015 IFIP/IEEE International Symposium on Integrated Network Management (IM) ◽

10.1109/inm.2015.7140323 ◽

2015 ◽

Cited By ~ 3

Author(s):

Yang Hong ◽

Changcheng Huang ◽

Biswajit Nandy ◽

Nabil Seddigh

Keyword(s):

Support Vector Machine ◽

Network Traffic ◽

Support Vector ◽

Traffic Classification ◽

Network Traffic Classification


A Clinical-Radiomics Nomogram Based on Computed Tomography for Predicting Risk of Local Recurrence After Radiotherapy in Nasopharyngeal Carcinoma

Frontiers in Oncology ◽

10.3389/fonc.2021.637687 ◽

2021 ◽

Vol 11 ◽

Author(s):

Chaohua Zhu ◽

Huixian Huang ◽

Xu Liu ◽

Hao Chen ◽

Hailan Jiang ◽

...

Keyword(s):

Machine Learning ◽

Computed Tomography ◽

Nasopharyngeal Carcinoma ◽

Local Recurrence ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Stable Model ◽

Clinical Factors ◽

Specificity And Sensitivity ◽

Radiomics Signature

Purpose: We aimed to establish a nomogram model based on computed tomography (CT) imaging radiomic signature and clinical factors to predict the risk of local recurrence in nasopharyngeal carcinoma (NPC) after intensity-modulated radiotherapy (IMRT). Methods: This was a retrospective study consisting of 156 NPC patients treated with IMRT. Radiomics features were extracted from the gross tumor volume for nasopharynx (GTVnx) in pretreatment CT images for patients with or without local recurrence. Discriminative radiomics features were selected after t-test and the least absolute shrinkage and selection operator (LASSO) analysis. The most stable model was obtained to generate radiomics signature (Rad_Score) by using machine learning models including Logistic Regression, K-Nearest Neighbor, Naive Bayes, Decision Tree, Stochastic Gradient Descent, Gradient Boosting Tree and Linear Support Vector Classification. A nomogram for local recurrence was established based on Rad_Score and clinical factors. The predictive performance of the nomogram was evaluated by discrimination ability and calibration ability. Decision Curve Analysis (DCA) was used to evaluate the clinical benefits of the multi-factor nomogram in predicting local recurrence after IMRT. Results: Local recurrence occurred in 42 patients. A total of 1,452 radiomics features were initially extracted, and seven stable features finally selected after LASSO analysis were used for machine learning algorithm modeling to generate Rad_Score. The nomogram showed that a greater Rad_Score was associated with a higher risk of local recurrence. The concordance index, specificity and sensitivity in the training cohort were 0.931 (95% CI: 0.8765–0.9856), 91.2 and 82.8%, respectively; whereas, in the validation cohort, they were 0.799 (95% CI: 0.6458–0.9515), 79.4, and 69.2%, respectively. Conclusion: The nomogram based on radiomics signature and clinical factors can predict the risk of local recurrence after IMRT in patients with NPC and provide evidence for early clinical intervention.
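
Sensitivity and specificity, as reported in this abstract, follow directly from the confusion-matrix counts. The sketch below assumes binary labels with 1 meaning local recurrence and 0 meaning no recurrence; the function name and the label vectors are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity is the true positive rate; specificity is the true negative rate.
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outcomes for eight patients (illustrative values only).
truth = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```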


An Approach Based on the Improved SVM Algorithm for Identifying Malware in Network Traffic

Security and Communication Networks ◽

10.1155/2021/5518909 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Bo Liu ◽

Jinfu Chen ◽

Songling Qin ◽

Zufa Zhang ◽

Yisong Liu ◽

...

Keyword(s):

Network Traffic ◽

Cyber Security ◽

False Positive ◽

False Positive Rate ◽

Support Vector ◽

Traffic Classification ◽

Svm Algorithm ◽

Network Traffic Classification ◽

Positive Rate ◽

Function Selection

Due to the growth and popularity of the internet, cyber security remains, and will continue to be, an important issue. Many network traffic classification methods and malware identification approaches have been proposed to solve this problem. However, the existing methods are not well suited to help security experts effectively solve this challenge due to their low accuracy and high false positive rate. To this end, we employ a machine learning-based classification approach to identify malware. The approach extracts features from network traffic and reduces the dimensionality of the features, which can effectively improve the accuracy of identification. Furthermore, we propose an improved SVM algorithm for classifying the network traffic, dubbed Optimized Facile Support Vector Machine (OFSVM). The OFSVM algorithm addresses the unsatisfactory classification performance of the original SVM algorithm in two respects, i.e., parameter optimization and kernel function selection. Therefore, in this paper, we present an approach for identifying malware in network traffic, called Network Traffic Malware Identification (NTMI). To evaluate the effectiveness of the NTMI approach proposed in this paper, we collect four real network traffic datasets and use a publicly available dataset, CAIDA, for our experiments. Evaluation results suggest that the NTMI approach can lead to higher accuracy while achieving a lower false positive rate compared with other identification methods. On average, the NTMI approach achieves an accuracy of 92.5% and a false positive rate of 5.527%.


Classifying relevant video tutorials for the school’s learning management system using support vector machine algorithm

10.31219/osf.io/scz4r ◽

2020 ◽

Author(s):

Castro Mayleen Dorcas Bondoc ◽

Tumibay Gilbert Malawit

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Learning Process ◽

Learning Algorithm ◽

Research Work ◽

Supervised Machine Learning ◽

Support Vector ◽

Video Tutorials ◽

Learning Management ◽

Face To Face Instruction

Today many schools, universities and institutions recognize the necessity and importance of using Learning Management Systems (LMS) as part of their educational services. This research work has applied an LMS in the teaching and learning process of the Bulacan State University (BulSU) Graduate School (GS) Program, enhancing face-to-face instruction with online components. The researchers use an LMS that provides educators with a platform that can motivate and engage students in a new educational environment through managed online classes. The LMS allows educators to distribute information and manage learning materials, assignments, quizzes, and communications. Aside from the basic functions of the LMS, the researchers use a Machine Learning (ML) algorithm, Support Vector Machine (SVM), to classify and identify the best related videos per topic. According to Maity [1], SVM is a supervised machine learning algorithm that analyzes data for classification and regression analysis. The results of this study showed that the integration of video tutorials in the LMS can significantly contribute to students' knowledge and skills in the learning process.


Network traffic analysis using Machine Learning Techniques in IoT Network

International Journal of Software Innovation ◽

10.4018/ijsi.289172 ◽

2021 ◽

Vol 9 (4) ◽

pp. 0-0

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Network Traffic ◽

Traffic Analysis ◽

Machine Learning Techniques ◽

Support Vector ◽

Static And Dynamic Analysis ◽

Network Traffic Analysis ◽

Cyber Threats ◽

Network Topologies

Internet of Things devices have limited intelligence and are resource-constrained; thus, they are vulnerable to cyber threats. Such threats can become harmful, infecting machines, disrupting network topologies, and denying service to legitimate users. Artificial intelligence-driven methods and advanced machine learning-based network investigation protect the network from malicious traffic. In this research, a support vector machine learning technique was used to classify normal and abnormal traffic. Network traffic analysis was performed to detect malicious traffic and protect the network from it. Static and dynamic analysis of malware was also performed. The Mininet emulator was selected for network design and VMware Fusion for creating a virtual environment; the host OS was Ubuntu Linux, and the network used a tree topology. Wireshark was used to open an existing pcap file that contains network traffic. The support vector machine classifier demonstrated the best performance with 99% accuracy.
