A performance comparison of feature selection techniques with SVM for network anomaly detection

Variable selection (also known as feature selection) is essential for reducing learning complexity by prioritizing features, particularly for massive, high-dimensional datasets such as network traffic data. In practice, however, performing feature selection effectively is not easy, despite the availability of existing selection techniques. In our initial experiments, we observed that existing selection techniques produce different sets of features even under the same conditions (e.g., a fixed size for the resulting set). In addition, individual selection techniques perform inconsistently, sometimes better and sometimes worse than others, so relying on any single one of them is risky when building models from the selected features. More critically, the selection process needs to be automated, since it otherwise requires laborious, intensive analysis by a group of experts. In this article, we explore the challenges of automated feature selection as applied to network anomaly detection. We first present an ensemble approach that benefits from existing feature selection techniques by combining them; one of the proposed ensemble techniques, based on greedy search, behaves highly consistently and yields results comparable to the existing techniques. We also address the problem of when to stop the feature elimination process and present a set of methods for determining the number of features in the reduced feature set. Our experiments on two recent network datasets show that the feature sets identified by the presented ensemble and stopping methods consistently yield performance comparable to conventional selection techniques while using fewer features.
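The two ideas in this abstract, combining several selectors and deciding how many features to keep, can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract does not specify the greedy-search ensemble or the stopping rules, so the sketch substitutes a simple mean-rank (Borda-style) aggregation and a tolerance-based stopping rule. The function names, the feature names, the `tolerance` parameter, and the scores are all hypothetical.

```python
from collections import defaultdict


def aggregate_rankings(rankings):
    """Combine several feature rankings (best first) by mean rank position.

    Assumption: every ranking lists the same set of features. This is a
    Borda-style stand-in for an ensemble, not the greedy-search technique
    mentioned in the abstract.
    """
    totals = defaultdict(float)
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            totals[feature] += position
    n = len(rankings)
    # Lower mean position is better; ties are broken by feature name.
    return sorted(totals, key=lambda f: (totals[f] / n, f))


def choose_feature_count(scores, tolerance=0.01):
    """Return the smallest k whose score is within `tolerance` of the best.

    Assumption: scores[k - 1] is a validation score measured with the
    top-k features of the aggregated ranking.
    """
    best = max(scores)
    for k, score in enumerate(scores, start=1):
        if score >= best - tolerance:
            return k
    return len(scores)


# Hypothetical rankings produced by three different selection techniques.
rankings = [
    ["duration", "bytes", "flags", "proto"],
    ["bytes", "duration", "proto", "flags"],
    ["bytes", "flags", "duration", "proto"],
]
ensemble = aggregate_rankings(rankings)

# Hypothetical validation scores for the top-1 to top-4 features.
scores = [0.80, 0.915, 0.918, 0.92]
k = choose_feature_count(scores)
selected = ensemble[:k]
print(selected)
```

Averaging rank positions smooths out the disagreement between individual selectors, and the tolerance rule trades a small loss in score for a smaller feature set, which is the kind of trade-off the abstract describes.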

Performance Comparison of Several Feature Selection Techniques for Offline Handwritten Character Recognition

2018 International Conference on Research in Intelligent and Computing in Engineering (RICE) ◽

10.1109/rice.2018.8509076 ◽

2018 ◽

Cited By ~ 2

Author(s):

Munish Kumar ◽

M. K. Jindal ◽

R. K. Sharma ◽

Simpel Rani Jindal

Keyword(s):

Feature Selection ◽

Character Recognition ◽

Performance Comparison ◽

Handwritten Character Recognition ◽

Handwritten Character ◽

Feature Selection Techniques

Multi-criteria decision support for feature selection in network anomaly detection system

International Journal of Data Analysis Techniques and Strategies ◽

10.1504/ijdats.2018.094132 ◽

2018 ◽

Vol 10 (3) ◽

pp. 334 ◽

Cited By ~ 1

Author(s):

C. Seelammal ◽

K. Vimala Devi

Keyword(s):

Feature Selection ◽

Decision Support ◽

Anomaly Detection ◽

Detection System ◽

Network Anomaly Detection ◽

Anomaly Detection System

Feature selection by multi-objective optimisation: Application to network anomaly detection by hierarchical self-organising maps

Knowledge-Based Systems ◽

10.1016/j.knosys.2014.08.013 ◽

2014 ◽

Vol 71 ◽

pp. 322-338 ◽

Cited By ~ 82

Author(s):

Emiro de la Hoz ◽

Eduardo de la Hoz ◽

Andrés Ortiz ◽

Julio Ortega ◽

Antonio Martínez-Álvarez

Keyword(s):

Feature Selection ◽

Anomaly Detection ◽

Multi Objective ◽

Network Anomaly Detection

A New Design of Custom Optimized CNN-LSTM Assists to Detect Network Anomaly Using Categorical Data

10.21203/rs.3.rs-490866/v1 ◽

2021 ◽

Author(s):

Kanmani R ◽

A. Christy Jeba Malar ◽

Roopa V ◽

Ranjani D ◽

Suganya R

Keyword(s):

Feature Selection ◽

Anomaly Detection ◽

Categorical Data ◽

Imbalanced Data ◽

Experimental Result ◽

Training Dataset ◽

Detection Model ◽

Effective Performance ◽

Network Anomaly Detection ◽

System Effectiveness

Abstract: For traditional intrusion detection models, system effectiveness depends entirely on the training dataset and on feature selection. Feature selection is labour-intensive and relies mainly on expert knowledge. Moreover, the training dataset contains a large amount of imbalanced data, which in turn tends to bias the model. Here, an automatic approach is introduced to correct these deficiencies. In this paper, the authors propose a novel network anomaly detection (NID) model built using categorical data. The model is designed as a modified form of deep neural network, used primarily for detecting anomalies within the network. A custom CNN-LSTM with Harris Hawks Optimization (named custom optimized CNN-LSTM) is designed as a new classifier, mainly used to detect anomalies from a word cloud and to distinguish the data with effective performance. The experimental results show that the proposed method achieves promising output for network anomaly detection.
