Extracting salient features for network intrusion detection using machine learning methods

This work presents a data preprocessing and feature selection framework to support data mining and network security experts in selecting a minimal feature set for intrusion detection data. The process is supported by detailed visualisation and examination of class distributions. Distribution histograms, scatter plots and information gain are presented as supporting feature reduction tools. The feature reduction process applied is based on decision tree pruning and backward elimination. The paper starts with an analysis of the KDD Cup '99 datasets and their potential for feature reduction. The dataset consists of connection records with 41 features whose relevance for intrusion detection is not clear. All traffic is classified either as 'normal' or into one of four attack types: denial-of-service, network probe, remote-to-local or user-to-root. Using our custom feature selection process, we show how the number of features in the dataset can be reduced significantly to a few salient features. We conclude by presenting minimal sets with 4 to 8 salient features for two-class and multi-class categorisation for detecting intrusions, as well as for the detection of individual attack classes; the performance using a static classifier compares favourably to the performance using all available features. The suggested process is general in nature and can be applied to any similar dataset.
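
Information gain, one of the supporting tools named in this abstract, can be illustrated with a minimal sketch. The code below is not the authors' implementation; the function names and the four toy connection records are assumptions made for illustration, and the features are treated as discrete.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting the labels on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy records, made up for illustration (not taken from KDD Cup '99).
labels = ["normal", "normal", "attack", "attack"]
features = {
    "flag": ["SF", "SF", "S0", "S0"],          # separates the two classes
    "protocol": ["tcp", "udp", "tcp", "udp"],  # carries no class information
}

# Rank feature names by information gain, highest first.
ranking = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
```

A ranking of this kind only scores features one at a time; it does not by itself account for redundancy between features, which is one reason to pair it with a wrapper step such as backward elimination.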

EFS-LSTM (Ensemble-Based Feature Selection With LSTM) Classifier for Intrusion Detection System

International Journal of e-Collaboration ◽

10.4018/ijec.2020100106 ◽

2020 ◽

Vol 16 (4) ◽

pp. 72-86

Author(s):

Preethi D. ◽

Neelu Khare

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Performance Metrics ◽

Short Term Memory ◽

Information Gain ◽

Detection System ◽

Classification Performance ◽

Performance Study ◽

Chi Square ◽

Network Intrusion

In this article, EFS-LSTM, a deep recurrent learning model, is proposed for network intrusion detection systems. The EFS-LSTM model uses ensemble-based feature selection (EFS) and LSTM (Long Short-Term Memory) for the classification of network intrusions. The EFS combines five feature selection mechanisms, namely information gain, gain ratio, chi-square, correlation-based feature selection, and symmetric uncertainty-based feature selection. The experiments were conducted using the benchmark NSL-KDD dataset and implemented using TensorFlow and Python. The EFS-LSTM classifier is evaluated using classification performance metrics and compared with LSTM classification on all 41 features without any feature selection, as well as on the features chosen by each individual feature selection technique. The performance study showed that the EFS-LSTM model performs better, with 99.8% accuracy, a higher detection rate and a lower false alarm rate.
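
The abstract does not say how the five rankings are combined. One common choice, assumed here purely for illustration, is a vote over each method's top-ranked features. In the sketch below the function name, the `top_k` and `min_votes` parameters, and the example rankings are all hypothetical; only the feature names follow the NSL-KDD column naming.

```python
from collections import Counter


def ensemble_select(rankings, top_k, min_votes):
    """Keep features that appear in the top_k of at least min_votes rankings."""
    votes = Counter()
    for ranked in rankings.values():
        votes.update(ranked[:top_k])
    # Sort by vote count (descending), then by name for a deterministic order.
    return sorted(
        (name for name, count in votes.items() if count >= min_votes),
        key=lambda name: (-votes[name], name),
    )


# Made-up rankings for three of the five selectors (illustration only).
rankings = {
    "information_gain": ["src_bytes", "service", "flag", "duration"],
    "gain_ratio": ["service", "src_bytes", "duration", "flag"],
    "chi_square": ["src_bytes", "flag", "service", "duration"],
}

selected = ensemble_select(rankings, top_k=2, min_votes=2)
```

With these inputs, `src_bytes` receives three votes and `service` two, so both are kept, while `flag` (one vote) is dropped.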

High-Performance Feature Selection Model for Network Intrusion Detection System

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1294.0986s319 ◽

2019 ◽

Vol 8 (6S3) ◽

pp. 1595-1597

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Feature Selection Method ◽

Reduction Process ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Network Intrusion Detection System ◽

Network Intrusions

Network intrusion detection is a task that demands continuous vigilance: traffic in the corporate network must be analyzed efficiently to detect intrusions. The performance of a Network Intrusion Detection System (NIDS) can be improved by adopting a feature selection or reduction process suited to present-day high-speed, real-time networks. This work focuses on identifying the key features of the audit dataset used to build an efficient, lightweight NIDS. The proposed method, titled Attribute Richness Based Feature Selection (ARFS), is applied to the NSL KDD dataset in order to analyze its performance. The results obtained are compared with the Correlation-based Feature Selection (CFS) and Information Gain (IG) feature selection methods. The proposed feature selection method produced a comparatively better detection rate.

Analysis of Feature Selection and Ensemble Classifier Methods for Intrusion Detection

International Journal of Natural Computing Research ◽

10.4018/ijncr.2018010104 ◽

2018 ◽

Vol 7 (1) ◽

pp. 57-72

Author(s):

H.P. Vinutha ◽

Poornima Basavaraju

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Detection Rate ◽

Information Gain ◽

False Positive Rate ◽

Ensemble Classifier ◽

Ensemble Classification ◽

Chi Square ◽

Traffic Pattern ◽

Data Mining Algorithms

Network security is becoming a more challenging task day by day. Intrusion detection systems (IDSs) are one of the methods used to monitor network activities. Data mining algorithms play a major role in the field of IDS. The NSL-KDD'99 dataset is used to study the network traffic pattern, which helps to identify possible attacks taking place on the network. The dataset contains 41 attributes and one class attribute categorized as normal, DoS, Probe, R2L and U2R. The proposed methodology aims to reduce the false positive rate and improve the detection rate by reducing the dimensionality of the dataset, since using all 41 attributes for detection is not good practice. Four feature selection methods, Chi-Square, SU, Gain Ratio and Information Gain, are used to evaluate the attributes, and unimportant features are removed to reduce the dimension of the data. Ensemble classification techniques such as Boosting, Bagging, Stacking and Voting are used to observe the detection rate separately with three base algorithms: Decision Stump, J48 and Random Forest.
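
As a rough illustration of how a chi-square score evaluates a single attribute against the class attribute, the sketch below computes the Pearson chi-square statistic from a contingency table. It is not taken from the paper; the function name and the toy records are assumptions, and the attribute is treated as discrete.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Pearson chi-square statistic between a discrete feature and the class."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    statistic = 0.0
    for value, value_total in feature_totals.items():
        for label, label_total in label_totals.items():
            # Count expected in this cell if feature and class were independent.
            expected = value_total * label_total / n
            statistic += (observed[(value, label)] - expected) ** 2 / expected
    return statistic


# Toy records, made up for illustration.
labels = ["normal", "normal", "attack", "attack"]
flag = ["SF", "SF", "S0", "S0"]          # depends on the class
protocol = ["tcp", "udp", "tcp", "udp"]  # independent of the class

flag_score = chi_square(flag, labels)
protocol_score = chi_square(protocol, labels)
```

A higher statistic indicates a stronger dependence between the attribute and the class, so attributes with scores near zero are candidates for removal.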

A Feature Selection Approach for Network Intrusion Detection Based on Tree-Seed Algorithm and K-Nearest Neighbor

2018 IEEE 4th International Symposium on Wireless Systems within the International Conferences on Intelligent Data Acquisition and Advanced Computing Systems (IDAACS-SWS) ◽

10.1109/idaacs-sws.2018.8525522 ◽

2018 ◽

Cited By ~ 3

Author(s):

Feng Chen ◽

Zhiwei Ye ◽

Chunzhi Wang ◽

Lingyu Yan ◽

Ruoxi Wang

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Nearest Neighbor ◽

Network Intrusion Detection ◽

K Nearest Neighbor ◽

Network Intrusion ◽

Selection Approach ◽

Feature Selection Approach ◽

Tree Seed

A Naive Feature Selection Method and Its Application in Network Intrusion Detection

2010 International Conference on Computational Intelligence and Security ◽

10.1109/cis.2010.96 ◽

2010 ◽

Cited By ~ 1

Author(s):

Tieming Chen ◽

Xiaoming Pan ◽

Yiguang Xuan ◽

Jixia Ma ◽

Jie Jiang

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Feature Selection Method ◽

Selection Method ◽

Network Intrusion Detection ◽

Network Intrusion

A feature selection algorithm combining information gain and multi-objective genetic search for intrusion detection system

MATEC Web of Conferences ◽

10.1051/matecconf/202133608008 ◽

2021 ◽

Vol 336 ◽

pp. 08008

Author(s):

Tao Xie

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection Rate ◽

Information Gain ◽

Detection System ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Genetic Search ◽

Multi Objective

In order to improve the detection rate and speed of an intrusion detection system, this paper proposes a feature selection algorithm. The algorithm uses information gain to rank the features in descending order, and then uses a multi-objective genetic algorithm to search the ranked features step by step for the optimal feature combination. We classified the Kddcup98 dataset into five classes, DOS, PROBE, R2L, and U2R, and conducted numerous experiments on each class. Experimental results show that, for each class of attack, the proposed algorithm not only speeds up feature selection but also significantly improves the detection rate.

Scrutinizing Attacks and Evaluating Performance Appraisal Parameters via Feature Selection in Intrusion Detection System

10.21203/rs.3.rs-748765/v1 ◽

2021 ◽

Author(s):

Navroop Kaur ◽

Meenakshi Bansal ◽

Sukhwinder Singh S

Keyword(s):

Feature Selection ◽

Performance Evaluation ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Denial Of Service ◽

Cyber Attacks ◽

Support Vector ◽

K Nearest Neighbor ◽

Evaluation Parameters

In modern times, firewall and antivirus packages are not sufficient to protect an organization from numerous cyber attacks. A computer IDS (Intrusion Detection System) is a crucial component that contributes to the success of an organization. An IDS is a software application responsible for scanning organization networks for suspicious activities and policy violations. It ensures the secure and reliable functioning of the network within an organization. IDSs have undergone huge transformations since their origin to cope with advancing computer crime. The primary motive of an IDS has been to improve the detection of attacks without degrading the performance of the network. The paper elaborates on the different types of IDS and the functions they perform. The NSL KDD dataset has been used for training and testing. Seven prominent classifiers, LR (Logistic Regression), NB (Naïve Bayes), DT (Decision Tree), AB (AdaBoost), RF (Random Forest), kNN (k Nearest Neighbor), and SVM (Support Vector Machine), have been studied along with their pros and cons, and feature selection has been applied to improve the performance evaluation parameters (Accuracy, Precision, Recall, and F1Score). The paper presents a detailed flowchart and algorithm depicting the procedure to perform feature selection using XGB (Extreme Gradient Booster) for four categories of attacks: DoS (Denial of Service), Probe, R2L (Remote to Local Attack), and U2R (User to Root Attack). The selected features have been ranked by their occurrence. The implementation has been conducted at five different ratios: 60-40%, 70-30%, 90-10%, 50-50%, and 80-20%. Different classifiers scored best for different performance evaluation parameters at different ratios. NB scored the best Accuracy and Recall values. DT and RF consistently performed with high accuracy. NB, SVM, and kNN achieved good F1Score.
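
The four evaluation parameters named in this abstract have standard definitions in terms of true and false positives and negatives. The sketch below computes them for a single positive class; the function name, the label strings and the five example predictions are assumptions made for illustration, and the code assumes a non-empty input.

```python
def binary_metrics(y_true, y_pred, positive="attack"):
    """Accuracy, precision, recall and F1 score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up predictions: 2 true positives, 1 false negative,
# 1 false positive and 1 true negative.
y_true = ["attack", "attack", "attack", "normal", "normal"]
y_pred = ["attack", "attack", "normal", "attack", "normal"]

metrics = binary_metrics(y_true, y_pred)
```

Because the four attack categories are very unevenly represented in intrusion datasets, accuracy alone can look high while recall on a rare class stays low, which is why the other three metrics are usually reported alongside it.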

Feature Selection Models Based on Hybrid Firefly Algorithm with Mutation Operator for Network Intrusion Detection

International Journal of Intelligent Engineering and Systems ◽

10.22266/ijies2021.0228.19 ◽

2021 ◽

Vol 14 (1) ◽

pp. 192-202

Author(s):

Karrar Alwan ◽

Ahmed AbuEl-Atta ◽

Hala Zayed ◽

...

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Firefly Algorithm ◽

Detection System ◽

Binary Classification ◽

Feature Reduction ◽

Detection Accuracy ◽

Multi Classification ◽

Modified Firefly Algorithm

Accurate intrusion detection is necessary to preserve network security. However, developing an efficient intrusion detection system is a complex problem due to the nonlinear nature of intrusion attempts, the unpredictable behaviour of network traffic, and the large number of features in the problem space. Hence, selecting the most effective and discriminating features is highly important. Additionally, eliminating irrelevant features can improve the detection accuracy as well as reduce the learning time of machine learning algorithms. However, feature reduction is an NP-hard problem. Therefore, several metaheuristics have been employed to determine the most effective feature subset within reasonable time. In this paper, two intrusion detection models are built based on a modified version of the firefly algorithm to perform the feature selection task. The first and second models are used for binary and multiclass classification, respectively. The modified firefly algorithm employs a mutation operation to avoid becoming trapped in local optima by enhancing the exploration capabilities of the original firefly algorithm. The significance of the selected features is evaluated using a Naïve Bayes classifier over a standard benchmark dataset, which contains different types of attacks. The results show the superiority of the modified firefly algorithm over the original firefly algorithm in terms of classification accuracy and the number of selected features under different scenarios. Additionally, the results confirm the superiority of the proposed intrusion detection system over other recently proposed systems in both binary and multi-class classification scenarios. The proposed system has 96.51% and 96.942% detection accuracy in binary classification and multi-classification, respectively. Moreover, the proposed system reduced the number of attributes from 41 to 9 for binary classification and to 10 for multi-classification.
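
The abstract does not specify the mutation operator. A bit-flip mutation over a binary feature mask is a common choice in metaheuristic feature selection and is assumed here only as an illustration; the function name, the `rate` parameter and the example mask are hypothetical.

```python
import random


def mutate(mask, rate, rng):
    """Flip each bit of a binary feature mask independently with probability rate."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


# A mask over 41 features: 1 means the feature is selected.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
mask = [1] * 10 + [0] * 31

mutated = mutate(mask, rate=0.05, rng=rng)
selected_count = sum(mutated)
```

Flipping a few bits at random lets a candidate solution move to feature subsets that attraction between fireflies alone would not reach, which is the exploration effect the abstract attributes to the mutation step.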

Network Intrusion Detection and Prevention Systems on Flooding and Worm Attacks

Advances in Digital Crime, Forensics, and Cyber Terrorism - Combating Security Breaches and Criminal Activity in the Digital Sphere ◽

10.4018/978-1-5225-0193-0.ch012 ◽

2016 ◽

pp. 183-207

Author(s):

P. Vetrivelan ◽

M. Jagannath ◽

T. S. Pradeep Kumar

Keyword(s):

Intrusion Detection ◽

Denial Of Service ◽

Packet Transmission ◽

Network Intrusion Detection ◽

The Internet ◽

Network Intrusion ◽

Worm Attacks ◽

Packet Marking ◽

Vast Network ◽

Intrusion Detection And Prevention

The Internet has greatly transformed the way business is done, but this vast network and its associated technologies have opened the door to an increasing number of security threats that are dangerous to networks. The first part of this chapter presents a class of denial-of-service attack, the TCP SYN Flood attack, which has been observed to cause severe damage; the second part covers worms, which are a major threat to the Internet. The TCP SYN Flood attack is detected by means of anomaly detection, and the real source of the attack is traced back using the Modified Efficient Packet Marking (EPM) algorithm. Smart camouflaging worms are detected by means of the Modified Controlled Packet Transmission (MCPT) technique. Finally, networks affected by these types of worms are detected and recovered by means of the Modified Centralized Worm Detector (MCWD) mechanism. Network Intrusion Detection and Prevention Systems (NIDPS) for flooding and worm attacks are analyzed and presented.
