A performance comparison of feature selection techniques with SVM for network anomaly detection

Author(s):  
Yonchanok Khaokaew ◽  
Tanapat Anusas-amornkul

2021 ◽  
Vol 12 (3) ◽  
pp. 1-28
Author(s):  
Makiya Nakashima ◽  
Alex Sim ◽  
Youngsoo Kim ◽  
Jonghyun Kim ◽  
Jinoh Kim

Variable selection (also known as feature selection) is essential for reducing learning complexity by prioritizing features, particularly for massive, high-dimensional datasets such as network traffic data. In practice, however, performing feature selection effectively is not easy, despite the availability of existing selection techniques. In our initial experiments, we observed that existing selection techniques produce different sets of features even under the same conditions (e.g., a fixed size for the resulting set). In addition, individual selection techniques perform inconsistently, sometimes better and sometimes worse than others, so relying on any single one of them is risky when building models from the selected features. More critically, there is a need to automate the selection process, since it otherwise requires laborious, intensive analysis by a group of experts. In this article, we explore the challenges of automated feature selection as applied to network anomaly detection. We first present an ensemble approach that incorporates existing feature selection techniques; one of the proposed ensemble techniques, based on greedy search, works highly consistently and shows results comparable to the existing techniques. We also address the problem of when to stop the feature elimination process and present a set of methods for determining the number of features in the reduced feature set. Our experimental results on two recent network datasets show that the feature sets identified by the presented ensemble and stopping methods consistently yield performance comparable to that of conventional selection techniques while using fewer features.
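The general idea of combining several selectors and then deciding where to stop can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a simple mean-rank aggregation in place of their ensemble techniques, and a forward "stop when the gain falls below a tolerance" rule in place of their stopping methods. The function names, the `tol` parameter, and the feature names in the usage note are all hypothetical.

```python
from collections import defaultdict


def rank_aggregate(rankings):
    """Combine feature rankings (best first) by mean rank position.

    Assumption: every ranking lists the same set of features.
    """
    totals = defaultdict(float)
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            totals[feature] += position
    n = len(rankings)
    # Lower mean position is better; ties are broken by name for determinism.
    return sorted(totals, key=lambda f: (totals[f] / n, f))


def select_until_plateau(ordered, score_fn, tol=0.01):
    """Add features in the given order; stop once the score gain drops below tol.

    score_fn is a caller-supplied function mapping a feature list to a score
    (for example, validation accuracy of a model trained on those features).
    """
    selected, best = [], float("-inf")
    for feature in ordered:
        score = score_fn(selected + [feature])
        if selected and score - best < tol:
            break  # the extra feature does not improve the score enough
        selected.append(feature)
        best = score
    return selected
```

As a usage example, three hypothetical selectors might rank the features `duration`, `bytes`, and `flags` differently; `rank_aggregate` merges the three lists into one ordering, and `select_until_plateau` then walks that ordering with whatever scoring function the caller provides, returning the shortest prefix after which the score stops improving by at least `tol`.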


2014 ◽  
Vol 71 ◽  
pp. 322-338 ◽  
Author(s):  
Emiro de la Hoz ◽  
Eduardo de la Hoz ◽  
Andrés Ortiz ◽  
Julio Ortega ◽  
Antonio Martínez-Álvarez

2021 ◽  
Author(s):  
Kanmani R ◽  
A.Christy Jeba Malar ◽  
Roopa V ◽  
Ranjani D ◽  
Suganya R

Abstract: For a traditional intrusion detection model, system effectiveness depends entirely on the training dataset and on feature selection. Feature selection requires considerable labour and relies mainly on expert knowledge. Moreover, the training dataset is often imbalanced, which tends to bias the model. Here, an automatic approach is introduced to correct these deficiencies. In this paper, the authors propose a novel network anomaly detection (NID) model built using categorical data. The model is a modified form of deep neural network designed primarily for detecting anomalies within the network. A custom CNN-LSTM with Harris Hawks Optimization (named custom optimized CNN-LSTM) is designed as a new classifier, used mainly to detect anomalies from a word cloud and to distinguish the data with effective performance. The experimental results show that the proposed method achieves promising results for network anomaly detection.


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 116216-116225 ◽  
Author(s):  
Jiewen Mao ◽  
Yongquan Hu ◽  
Dong Jiang ◽  
Tongquan Wei ◽  
Fuke Shen
