Analysis of Feature Selection and Ensemble Classifier Methods for Intrusion Detection

H.P. Vinutha; Poornima Basavaraju

doi:10.4018/ijncr.2018010104

International Journal of Natural Computing Research ◽

2018 ◽

Vol 7 (1) ◽

pp. 57-72

Author(s):

H.P. Vinutha ◽

Poornima Basavaraju

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Detection Rate ◽

Information Gain ◽

False Positive Rate ◽

Ensemble Classifier ◽

Ensemble Classification ◽

Chi Square ◽

Traffic Pattern ◽

Data Mining Algorithms

Network security is becoming a more challenging task day by day. Intrusion detection systems (IDSs) are one of the methods used to monitor network activity, and data mining algorithms play a major role in the field. The NSL-KDD'99 dataset is used to study network traffic patterns, which helps to identify possible attacks taking place on the network. The dataset contains 41 attributes and one class attribute categorized as normal, DoS, Probe, R2L and U2R. The proposed methodology aims to reduce the false positive rate and improve the detection rate by reducing the dimensionality of the dataset, since using all 41 attributes for detection is not good practice. Four feature selection methods, Chi-Square, symmetric uncertainty (SU), Gain Ratio and Information Gain, are used to evaluate the attributes, and unimportant features are removed to reduce the dimension of the data. The ensemble classification techniques Boosting, Bagging, Stacking and Voting are then applied, and the detection rate is observed separately for three base algorithms: Decision Stump, J48 and Random Forest.
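As a rough illustration of how one of the scoring methods named in this abstract ranks an attribute, the sketch below computes information gain for a discrete feature. It is not the authors' implementation; the function names and the toy records are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Class entropy minus the entropy that remains after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (not taken from NSL-KDD).
labels = ["normal", "normal", "DoS", "DoS"]
protocol = ["tcp", "tcp", "icmp", "icmp"]  # separates the two classes perfectly
flag = ["SF", "SF", "SF", "SF"]            # constant, so it carries no information

ig_protocol = information_gain(protocol, labels)
ig_flag = information_gain(flag, labels)
```

A feature such as `flag` in this toy data would score zero and be a candidate for removal, while `protocol` would be retained.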

A feature selection algorithm combining information gain and multi-objective genetic search for intrusion detection system

MATEC Web of Conferences ◽

10.1051/matecconf/202133608008 ◽

2021 ◽

Vol 336 ◽

pp. 08008

Author(s):

Tao Xie

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection Rate ◽

Information Gain ◽

Detection System ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Genetic Search ◽

Multi Objective

To improve the detection rate and speed of intrusion detection systems, this paper proposes a feature selection algorithm. The algorithm uses information gain to rank the features in descending order, and then uses a multi-objective genetic algorithm to gradually search the ranked features for the optimal feature combination. We classified the Kddcup98 dataset into five classes, DOS, PROBE, R2L, and U2R, and conducted numerous experiments on each class. Experimental results show that, for each class of attack, the proposed algorithm not only speeds up feature selection but also significantly improves the detection rate.
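The rank-then-search idea described in this abstract can be sketched as follows. This is only an illustration: a simple linear scan over growing prefixes of the ranking stands in for the paper's multi-objective genetic search, and the scores, the `evaluate` objective and all function names are hypothetical.

```python
def rank_features(scores):
    """Return feature names sorted by score, highest first."""
    return sorted(scores, key=scores.get, reverse=True)


def best_prefix(ranked, evaluate):
    """Evaluate growing prefixes of the ranking and keep the best-scoring one.

    This linear scan is a stand-in for a genetic search, not a reproduction of it.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical information-gain scores for four features.
scores = {"src_bytes": 0.82, "duration": 0.10, "service": 0.67, "land": 0.01}

# Hypothetical objective: reward useful features, penalise subset size.
useful = {"src_bytes", "service"}


def evaluate(subset):
    return len(useful & set(subset)) - 0.1 * len(subset)


ranked = rank_features(scores)
subset, score = best_prefix(ranked, evaluate)
```

In practice `evaluate` would train and validate a classifier on the candidate subset; the size penalty here merely mimics the trade-off between detection rate and speed.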

EFS-LSTM (Ensemble-Based Feature Selection With LSTM) Classifier for Intrusion Detection System

International Journal of e-Collaboration ◽

10.4018/ijec.2020100106 ◽

2020 ◽

Vol 16 (4) ◽

pp. 72-86

Author(s):

Preethi D. ◽

Neelu Khare

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Performance Metrics ◽

Short Term Memory ◽

Information Gain ◽

Detection System ◽

Classification Performance ◽

Performance Study ◽

Chi Square ◽

Network Intrusion

In this article, EFS-LSTM, a deep recurrent learning model, is proposed for network intrusion detection systems. The EFS-LSTM model uses ensemble-based feature selection (EFS) and LSTM (Long Short-Term Memory) for the classification of network intrusions. The EFS combines five feature selection mechanisms, namely information gain, gain ratio, chi-square, correlation-based feature selection, and symmetric uncertainty-based feature selection. The experiments were conducted on the benchmark NSL-KDD dataset and implemented using TensorFlow and Python. The EFS-LSTM classifier is evaluated using classification performance metrics and compared against LSTM trained on all 41 features without feature selection, as well as against LSTM combined with each individual feature selection technique. The performance study showed that the EFS-LSTM model performs better, with 99.8% accuracy, a higher detection rate and a lower false alarm rate.
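The abstract does not say how the five selectors are combined. One plausible reading, shown purely as an assumption, is a vote: keep a feature when enough of the individual selectors chose it. The function name, the threshold and the example feature lists below are all hypothetical.

```python
from collections import Counter


def ensemble_select(selections, min_votes):
    """Keep features chosen by at least `min_votes` of the individual selectors."""
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical top-3 lists from the five selectors named in the abstract.
selections = {
    "information_gain": ["src_bytes", "service", "flag"],
    "gain_ratio": ["src_bytes", "flag", "logged_in"],
    "chi_square": ["src_bytes", "service", "count"],
    "cfs": ["service", "flag", "count"],
    "symmetric_uncertainty": ["src_bytes", "service", "flag"],
}

# A feature survives if at least three of the five selectors picked it.
selected = ensemble_select(selections, min_votes=3)
```

The surviving features would then be the input columns for the downstream classifier.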

A Robust Gene selection Method for Microarray-based Cancer Classification

Cancer Informatics ◽

10.4137/cin.s3794 ◽

2010 ◽

Vol 9 ◽

pp. CIN.S3794 ◽

Cited By ~ 21

Author(s):

Xiaosheng Wang ◽

Osamu Gotoh

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Gene Selection ◽

Information Gain ◽

Expression Profiles ◽

Feature Selection Method ◽

Gene Expression Profiles ◽

Molecular Classification ◽

Selection Method ◽

Chi Square

Gene selection is of vital importance in molecular classification of cancer using high-dimensional gene expression data. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, developing flexible and robust feature selection methods is extremely crucial. We investigated the properties of one feature selection approach proposed in our previous work, which was the generalization of the feature selection method based on the depended degree of attribute in rough sets. We compared the feature selection method with the established methods: the depended degree, chi-square, information gain, Relief-F and symmetric uncertainty, and analyzed its properties through a series of classification experiments. The results revealed that our method was superior to the canonical depended degree of attribute based method in robustness and applicability. Moreover, the method was comparable to the other four commonly used methods. More importantly, the method can exhibit the inherent classification difficulty with respect to different gene expression datasets, indicating the inherent biology of specific cancers.

Extracting salient features for network intrusion detection using machine learning methods

South African Computer Journal ◽

10.18489/sacj.v52i0.200 ◽

2014 ◽

Vol 52 ◽

Cited By ~ 13

Author(s):

Ralf C. Staudemeyer ◽

Christian W. Omlin

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Information Gain ◽

Selection Process ◽

Denial Of Service ◽

Reduction Process ◽

Feature Reduction ◽

General Nature ◽

Network Intrusion ◽

Salient Features

This work presents a data preprocessing and feature selection framework to support data mining and network security experts in selecting a minimal feature set for intrusion detection data. This process is supported by detailed visualisation and examination of class distributions. Distribution histograms, scatter plots and information gain are presented as supportive feature reduction tools. The feature reduction process applied is based on decision tree pruning and backward elimination. This paper starts with an analysis of the KDD Cup '99 datasets and their potential for feature reduction. The dataset consists of connection records with 41 features whose relevance for intrusion detection is not clear. All traffic is classified either as 'normal' or into one of four attack types: denial-of-service, network probe, remote-to-local or user-to-root. Using our custom feature selection process, we show how we can significantly reduce the number of features in the dataset to a few salient features. We conclude by presenting minimal sets with 4 to 8 salient features for two-class and multi-class categorisation for detecting intrusions, as well as for the detection of individual attack classes; the performance using a static classifier compares favourably to the performance using all available features. The suggested process is of a general nature and can be applied to any similar dataset.

Building an efficient intrusion detection system based on feature selection and ensemble classifier

Computer Networks ◽

10.1016/j.comnet.2020.107247 ◽

2020 ◽

Vol 174 ◽

pp. 107247 ◽

Cited By ~ 14

Author(s):

Yuyang Zhou ◽

Guang Cheng ◽

Shanqing Jiang ◽

Mian Dai

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Ensemble Classifier

Intrusion detection method based on information gain and ReliefF feature selection

2019 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2019.8851756 ◽

2019 ◽

Author(s):

Yong Zhang ◽

Xuezhen Ren ◽

Jie Zhang

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Detection Method ◽

Information Gain

A Feature Selection Method for Improved Clonal Algorithm Towards Intrusion Detection

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001416590138 ◽

2016 ◽

Vol 30 (05) ◽

pp. 1659013 ◽

Cited By ~ 7

Author(s):

Chunyong Yin ◽

Luyu Ma ◽

Lu Feng

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

False Positive ◽

False Positive Rate ◽

Feature Selection Method ◽

Selection Method ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Positive Rate ◽

Better Than

Intrusion detection is a security mechanism used to detect attacks and intrusion behaviors. Because existing clonal selection algorithms applied to intrusion detection suffer from low accuracy and a high false positive rate, in this paper we propose a feature selection method for an improved clonal algorithm. The improved method detects intrusion behavior by selecting the best individuals overall and cloning them. Experimental results show that the feature selection algorithm outperforms the traditional feature selection algorithm on different classifiers, and that the final detection results are better than those of the traditional clonal algorithm, with 99.6% accuracy and a 0.1% false positive rate.

A Novel Intrusion Detection System for RPL Based IoT Networks with Bio-Inspired Feature Selection and Ensemble Classifier

10.21203/rs.3.rs-442429/v1 ◽

2021 ◽

Author(s):

Jayaprakash Pokala ◽

B. Lalitha

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Ensemble Classifier ◽

Support Vector ◽

Routing Attacks ◽

Salp Swarm Algorithm ◽

Network Intrusion ◽

Swarm Algorithm

The Internet of Things (IoT) is a powerful recent trend that allows communication and networking among many sources over the internet. IoT networks based on the routing protocol for low power and lossy networks (RPL) may be exposed to many routing attacks because of the resource-constrained and open nature of IoT nodes. Hence, a network intrusion detection system (NIDS) is needed to protect RPL based IoT networks from routing attacks. Existing techniques for anomaly-based NIDS (ANIDS) are subject to a high false alarm rate (FAR). Therefore, a novel bio-inspired voting ensemble classifier with a feature selection technique is proposed in this paper to improve the performance of ANIDS for RPL based IoT networks. The proposed voting ensemble classifier combines the results of various base classifiers, namely logistic regression, support vector machine, decision tree, bidirectional long short-term memory and K-nearest neighbor, to detect attacks accurately based on the majority voting rule. The optimized weights of the base classifiers are obtained using a feature selection method called simulated annealing based improved salp swarm algorithm (SA-ISSA), which is a hybridization of particle swarm optimization, opposition based learning and the salp swarm algorithm. The experiments are performed with the RPL-NIDDS17 dataset, which contains seven types of attack instances. The performance of the proposed model is evaluated and compared with existing feature selection and classification techniques in terms of accuracy, attack detection rate (ADR), FAR and so on. The proposed ensemble classifier shows better performance, with higher accuracy (96.4%) and ADR (97.7%) and reduced FAR (3.6%).
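The majority voting rule mentioned in this abstract can be illustrated with a short sketch. It is unweighted, so it does not model the optimized classifier weights the paper describes; the function names and the example predictions are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties go to the label seen first in `predictions` (an arbitrary choice).
    """
    return Counter(predictions).most_common(1)[0][0]


def vote_all(per_classifier):
    """Combine per-classifier prediction lists into one label per sample."""
    return [majority_vote(sample) for sample in zip(*per_classifier)]


# Hypothetical predictions from three base classifiers on three samples.
per_classifier = [
    ["attack", "normal", "attack"],
    ["attack", "normal", "normal"],
    ["normal", "normal", "attack"],
]

combined = vote_all(per_classifier)
```

A weighted variant would sum a per-classifier weight for each label instead of counting one vote per classifier.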

The Application of Multi-Class Support Vector Machines on Intrusion Detection System with the Feature Selection using Information Gain

Proceedings of the 1st Annual International Conference on Mathematics, Science, and Education (ICoMSE 2017) ◽

10.2991/icomse-17.2018.1 ◽

2018 ◽

Author(s):

Jihan Maharani ◽

Zuherman Rustam

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Information Gain ◽

Detection System ◽

Support Vector ◽

Vector Machines

A Comparative Study of Data Mining Algorithms for High Detection Rate in Intrusion Detection System

Annals of Emerging Technologies in Computing ◽

10.33166/aetic.2018.01.005 ◽

2018 ◽

Vol 2 (1) ◽

pp. 49-57 ◽

Cited By ~ 10

Author(s):

Nabeela Ashraf ◽

Waqar Ahmad ◽

Rehan Ashraf

Keyword(s):

Intrusion Detection ◽

Intrusion Detection System ◽

Detection Rate ◽

Detection System ◽

Huge Amount ◽

High Detection Rate ◽

Data Mining Algorithms ◽

Secure Networks ◽

Mining Algorithms ◽

Speed And Accuracy

Due to the fast growth and tradition of the internet over the last decades, network security problems are increasing rapidly. Humans cannot keep up with the speed of processes and the huge amount of data involved in handling network anomalies, so substantial automation is needed in terms of both speed and accuracy. An intrusion detection system (IDS) is one approach to recognizing illegal access and rare attacks in order to secure networks. In this paper, Naive Bayes, J48 and Random Forest classifiers are compared in terms of the detection rate and accuracy of the IDS. The KDD_NSL dataset is used for the experiments.
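For reference, detection rate, false positive rate and accuracy are commonly derived from confusion-matrix counts, as in the sketch below. It uses the usual textbook definitions rather than anything stated in the abstract, and the function name and example labels are hypothetical. It assumes both classes occur in `y_true`.

```python
def ids_metrics(y_true, y_pred, positive="attack"):
    """Compute detection rate, false positive rate and accuracy for a binary IDS."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # attack correctly flagged
            else:
                fn += 1  # attack missed
        else:
            if pred == positive:
                fp += 1  # normal traffic raised as an alarm
            else:
                tn += 1  # normal traffic correctly passed
    return {
        "detection_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical ground truth and predictions for eight connections.
y_true = ["attack"] * 4 + ["normal"] * 4
y_pred = ["attack", "attack", "attack", "normal",
          "normal", "normal", "normal", "attack"]

metrics = ids_metrics(y_true, y_pred)
```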