Evaluating Feature Selection Methods for Network Intrusion Detection with Kyoto Data

Given the large volume of data flowing through network routers, there is strong demand for detecting malicious traffic so that users get reliable network operation and their information stays secure. Predictive models are needed to decide whether a network traffic record is normal or malicious, and machine learning methods are increasingly used to build them. Such models must monitor and analyze large amounts of network data in a reasonable amount of time, usually in real time. They cannot always process all of the data, so data reduction methods are needed to cut the amount that has to be processed. Feature selection is one such method and can be used to decrease processing time. It is important to understand which features are most relevant to determining whether a traffic record is malicious, and to avoid using the whole feature set. It is equally important that the simpler model built from the reduced feature set be as effective as a model that uses all the features. Feature selection is therefore an important pre-processing step in the detection of network attacks. The goal is to remove irrelevant and redundant features in order to increase the overall effectiveness of an intrusion detection system without negatively affecting classification performance. Most previous feature selection studies in intrusion detection have used the KDD 99 dataset. Because KDD 99 is outdated, in this paper we compare different feature selection methods on a relatively new dataset, Kyoto 2006+, for which no comprehensive comparison of feature selection approaches exists.
In the present work, we study four filter-based feature selection methods, drawn from two categories, for network intrusion detection. Three filter-based feature rankers and one filter-based subset evaluation technique are compared with one another and with the baseline case that applies no feature selection. We also apply statistical analysis to determine whether the performance differences between these feature selection methods are significant. We find that, among all the feature selection methods, Signal-to-Noise (S2N) gives the best performance. It also outperforms the no-feature-selection baseline in all experiments.
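As an illustration of what a filter-based feature ranker does, the following minimal sketch scores each feature with a signal-to-noise ratio, taken here as the difference between the class means divided by the sum of the class standard deviations. The abstract does not give the exact formula or implementation used in the paper, so the formula variant (absolute value, population standard deviation), the function names `s2n_scores` and `top_k_features`, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def s2n_scores(rows, labels):
    """Return one signal-to-noise score per feature column.

    Assumed formula: |mean_pos - mean_neg| / (std_pos + std_neg).
    Labels are 1 (malicious) or 0 (normal); both classes must be present.
    """
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    scores = []
    for j in range(len(rows[0])):
        p = [r[j] for r in pos]
        n = [r[j] for r in neg]
        diff = abs(mean(p) - mean(n))
        spread = pstdev(p) + pstdev(n)
        if spread > 0:
            scores.append(diff / spread)
        else:
            # Feature is constant within each class: perfectly separating
            # if the class means differ, uninformative otherwise.
            scores.append(float("inf") if diff > 0 else 0.0)
    return scores

def top_k_features(rows, labels, k):
    """Return the indices of the k highest-scoring features."""
    scores = s2n_scores(rows, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]

# Hypothetical toy data: column 0 separates the classes, column 1 does not.
rows = [[0.9, 5.0], [1.1, 3.0], [1.0, 4.0],
        [0.1, 4.0], [0.0, 5.0], [0.2, 3.0]]
labels = [1, 1, 1, 0, 0, 0]
ranking = top_k_features(rows, labels, 1)
```

A ranker of this kind scores each feature independently of the others, which is what distinguishes it from a subset evaluation technique that scores groups of features together.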

A Feature Selection Approach for Network Intrusion Detection Based on Tree-Seed Algorithm and K-Nearest Neighbor

2018 IEEE 4th International Symposium on Wireless Systems within the International Conferences on Intelligent Data Acquisition and Advanced Computing Systems (IDAACS-SWS) ◽

10.1109/idaacs-sws.2018.8525522 ◽

2018 ◽

Cited By ~ 3

Author(s):

Feng Chen ◽

Zhiwei Ye ◽

Chunzhi Wang ◽

Lingyu Yan ◽

Ruoxi Wang

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Nearest Neighbor ◽

Network Intrusion Detection ◽

K Nearest Neighbor ◽

Network Intrusion ◽

Selection Approach ◽

Feature Selection Approach ◽

Tree Seed

A Naive Feature Selection Method and Its Application in Network Intrusion Detection

2010 International Conference on Computational Intelligence and Security ◽

10.1109/cis.2010.96 ◽

2010 ◽

Cited By ~ 1

Author(s):

Tieming Chen ◽

Xiaoming Pan ◽

Yiguang Xuan ◽

Jixia Ma ◽

Jie Jiang

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Feature Selection Method ◽

Selection Method ◽

Network Intrusion Detection ◽

Network Intrusion

Influence Analysis of Feature Selection to Network Intrusion Detection System Performance Using NSL-KDD Dataset

2019 International Conference on Computer Science, Information Technology, and Electrical Engineering (ICOMITEE) ◽

10.1109/icomitee.2019.8920961 ◽

2019 ◽

Cited By ~ 2

Author(s):

Lukman Hakim ◽

Rahilla Fatma ◽

Novriandi

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

System Performance ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Influence Analysis ◽

Network Intrusion ◽

Network Intrusion Detection System

A Feature Selection Model for Network Intrusion Detection System Based on PSO, GWO, FFA and GA Algorithms

Symmetry ◽

10.3390/sym12061046 ◽

2020 ◽

Vol 12 (6) ◽

pp. 1046 ◽

Cited By ~ 3

Author(s):

Omar Almomani

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Selection Model ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Proposed Model ◽

Positive Rate

A network intrusion detection system (NIDS) aims to identify malicious activity in a network by examining network traffic behavior. Data mining and machine learning (ML) approaches are widely used in NIDSs to discover anomalies. Feature selection plays a significant role in improving NIDS performance, because anomaly detection uses a large number of features that take considerable time to process. The feature selection approach therefore affects both the time needed to analyze traffic behavior and the accuracy achieved. This study proposes a feature selection model for NIDSs based on particle swarm optimization (PSO), grey wolf optimizer (GWO), firefly optimization (FFA) and genetic algorithm (GA), with the aim of improving NIDS performance. The proposed model deploys wrapper-based methods with the GA, PSO, GWO and FFA algorithms for selecting features, implemented with the open-source Anaconda Python distribution, and deploys filter-based methods for the mutual information (MI) of the GA, PSO, GWO and FFA algorithms, producing 13 sets of rules. The features derived from the proposed model are evaluated with support vector machine (SVM) and J48 ML classifiers on the UNSW-NB15 dataset. In the experiments, Rule 13 (R13) reduces the feature set to 30 features and Rule 12 (R12) reduces it to 13 features. Rule 13 and Rule 12 offer the best results in terms of F-measure, accuracy and sensitivity. The GA shows good results in terms of True Positive Rate (TPR) and False Negative Rate (FNR). Rules 11, 9 and 8 show good results in terms of False Positive Rate (FPR), while PSO shows good results in terms of precision and True Negative Rate (TNR). The results indicate that an intrusion detection system using fewer features can achieve higher accuracy.
The proposed feature selection model for NIDS is a form of rule-based pattern recognition for discovering computer network attacks, which falls within the scope of the journal Symmetry.
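The abstract does not describe how mutual information is computed or how it is combined with the optimizers' outputs. For orientation only, the sketch below computes the textbook mutual information between one discrete feature and the class label, which is the quantity a filter-based MI method would typically rank features by. The function name `mutual_information` and the toy columns `proto` and `flag` are hypothetical and are not taken from the paper or from UNSW-NB15.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi

# Hypothetical toy columns: `proto` determines the label, `flag` is unrelated.
proto = ["tcp", "tcp", "udp", "udp"]
flag = ["a", "b", "a", "b"]
label = [1, 1, 0, 0]

mi_proto = mutual_information(proto, label)  # feature fully determines the label
mi_flag = mutual_information(flag, label)    # feature independent of the label
```

A filter of this kind needs no classifier in the loop, whereas a wrapper-based method scores each candidate feature subset by training and evaluating a classifier on it.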

Bootstrap-based homogeneous ensemble feature selection for network intrusion detection system

Developments of Artificial Intelligence Technologies in Computation and Robotics ◽

10.1142/9789811223334_0004 ◽

2020 ◽

Author(s):

Yeshalem Gezahegn Damtew ◽

Hongmei Chen ◽

Burhan Mohi Yu Din

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Network Intrusion Detection System ◽

Selection For ◽

Homogeneous Ensemble

Network intrusion detection based on deep learning model optimized with rule-based hybrid feature selection

Information Security Journal: A Global Perspective ◽

10.1080/19393555.2020.1767240 ◽

2020 ◽

Vol 29 (6) ◽

pp. 267-283

Author(s):

Femi Emmanuel Ayo ◽

Sakinat Oluwabukonla Folorunso ◽

Adebayo A. Abayomi-Alli ◽

Adebola Olayinka Adekunle ◽

Joseph Bamidele Awotunde

Keyword(s):

Feature Selection ◽

Deep Learning ◽

Intrusion Detection ◽

Learning Model ◽

Network Intrusion Detection ◽

Rule Based ◽

Network Intrusion ◽

Deep Learning Model

Anomaly-Based Network Intrusion Detection System through Feature Selection and Hybrid Machine Learning Technique

2018 16th International Conference on ICT and Knowledge Engineering (ICT&KE) ◽

10.1109/ictke.2018.8612331 ◽

2018 ◽

Cited By ~ 3

Author(s):

Apichit Pattawaro ◽

Chantri Polprasert

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Machine Learning Technique ◽

Network Intrusion ◽

Learning Technique ◽

Hybrid Machine

A Hybrid Feature Selection Framework for Enhancing Network Intrusion Detection

Asian Journal of Research in Social Sciences and Humanities ◽

10.5958/2249-7315.2017.00216.7 ◽

2017 ◽

Vol 7 (3) ◽

pp. 909

Author(s):

J. Rene Beulah ◽

D. Shalini Punithavathani

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Selection Framework

Clustering-Based Network Intrusion Detection

International Journal of Reliability Quality and Safety Engineering ◽

10.1142/s0218539307002568 ◽

2007 ◽

Vol 14 (02) ◽

pp. 169-187 ◽

Cited By ~ 55

Author(s):

Shi Zhong ◽

Taghi M. Khoshgoftaar ◽

Naeem Seliya

Keyword(s):

Data Mining ◽

Network Security ◽

Intrusion Detection ◽

Unsupervised Learning ◽

Network Traffic ◽

Clustering Algorithms ◽

Network Intrusion Detection ◽

Learning Methods ◽

High Detection Rate ◽

Network Intrusion

Recently, data mining methods have gained importance in addressing network security issues, including network intrusion detection, a challenging task in network security. Intrusion detection systems aim to identify attacks with a high detection rate and a low false alarm rate. Classification-based data mining models for intrusion detection are often ineffective in dealing with dynamic changes in intrusion patterns and characteristics. Consequently, unsupervised learning methods have received closer attention for network intrusion detection. We investigate multiple centroid-based unsupervised clustering algorithms for intrusion detection, and propose a simple yet effective self-labeling heuristic for detecting attack and normal clusters of network traffic audit data. The clustering algorithms investigated include k-means, Mixture-of-Spherical Gaussians, Self-Organizing Map, and Neural-Gas. The network traffic datasets provided by the DARPA 1998 offline intrusion detection project are used in our empirical investigation, which demonstrates the feasibility and promise of unsupervised learning methods for network intrusion detection. In addition, a comparative analysis shows the advantage of clustering-based methods over supervised classification techniques in identifying new or unseen attack types.
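To make the idea of clustering followed by self-labeling concrete, here is a minimal sketch of k-means on toy two-dimensional points, with clusters labeled afterwards. The abstract does not spell out the paper's self-labeling heuristic, so the rule used here (treat the largest cluster as normal traffic and every other cluster as attack) is a deliberately simplified assumption, not the authors' method. The deterministic initialization, the function names `kmeans` and `self_label`, and the data are likewise hypothetical.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Plain k-means; returns (centroids, cluster index per point)."""
    # Deterministic initialization for this sketch: the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        assign = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return centroids, assign

def self_label(assign, k):
    """Assumed rule: the largest cluster is 'normal', all others 'attack'."""
    sizes = [assign.count(c) for c in range(k)]
    normal = max(range(k), key=lambda c: sizes[c])
    return ["normal" if a == normal else "attack" for a in assign]

# Hypothetical toy data: four points near the origin, two far away.
points = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2),
          (0.2, 0.1), (0.0, 0.1), (5.1, 4.9)]
centroids, assign = kmeans(points, k=2)
labels = self_label(assign, k=2)
```

The labeling step uses no ground-truth labels, which is why an approach of this kind can flag attack types that never appeared in any training data, provided normal traffic really does dominate the sample.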

An Effective Feature Selection Approach for Network Intrusion Detection

2013 IEEE Eighth International Conference on Networking, Architecture and Storage ◽

10.1109/nas.2013.49 ◽

2013 ◽

Cited By ~ 21

Author(s):

Fengli Zhang ◽

Dan Wang

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Selection Approach ◽

Feature Selection Approach
