Multi Class SVM Algorithm with Active Learning for Network Traffic Classification

2021 ◽  
pp. 114885
Author(s):  
Shi Dong
2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Bo Liu ◽  
Jinfu Chen ◽  
Songling Qin ◽  
Zufa Zhang ◽  
Yisong Liu ◽  
...  

Due to the growth and popularity of the internet, cyber security remains, and will continue, to be an important issue. There are many network traffic classification methods or malware identification approaches that have been proposed to solve this problem. However, the existing methods are not well suited to help security experts effectively solve this challenge due to their low accuracy and high false positive rate. To this end, we employ a machine learning-based classification approach to identify malware. The approach extracts features from network traffic and reduces the dimensionality of the features, which can effectively improve the accuracy of identification. Furthermore, we propose an improved SVM algorithm for classifying the network traffic dubbed Optimized Facile Support Vector Machine (OFSVM). The OFSVM algorithm solves the problem that the original SVM algorithm is not satisfactory for classification from two aspects, i.e., parameter optimization and kernel function selection. Therefore, in this paper, we present an approach for identifying malware in network traffic, called Network Traffic Malware Identification (NTMI). To evaluate the effectiveness of the NTMI approach proposed in this paper, we collect four real network traffic datasets and use a publicly available dataset CAIDA for our experiments. Evaluation results suggest that the NTMI approach can lead to higher accuracy while achieving a lower false positive rate compared with other identification methods. On average, the NTMI approach achieves an accuracy of 92.5% and a false positive rate of 5.527%.


Information ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 233 ◽  
Author(s):  
Zuleika Nascimento ◽  
Djamel Sadok

Network traffic classification aims to identify categories of traffic or applications of network packets or flows. It is an area that continues to gain attention by researchers due to the necessity of understanding the composition of network traffics, which changes over time, to ensure the network Quality of Service (QoS). Among the different methods of network traffic classification, the payload-based one (DPI) is the most accurate, but presents some drawbacks, such as the inability of classifying encrypted data, the concerns regarding the users’ privacy, the high computational costs, and ambiguity when multiple signatures might match. For that reason, machine learning methods have been proposed to overcome these issues. This work proposes a Multi-Objective Divide and Conquer (MODC) model for network traffic classification, by combining, into a hybrid model, supervised and unsupervised machine learning algorithms, based on the divide and conquer strategy. Additionally, it is a flexible model since it allows network administrators to choose between a set of parameters (pareto-optimal solutions), led by a multi-objective optimization process, by prioritizing flow or byte accuracies. Our method achieved 94.14% of average flow accuracy for the analyzed dataset, outperforming the six DPI-based tools investigated, including two commercial ones, and other machine learning-based methods.


Sign in / Sign up

Export Citation Format

Share Document