Analysis of Feature Selection and Ensemble Classifier Methods for Intrusion Detection

2018 ◽  
Vol 7 (1) ◽  
pp. 57-72
Author(s):  
H.P. Vinutha ◽  
Poornima Basavaraju

Network security is becoming a more challenging task day by day. Intrusion detection systems (IDSs) are one of the methods used to monitor network activity. Data mining algorithms play a major role in the field of IDS. The NSL-KDD'99 dataset is used to study network traffic patterns, which helps to identify possible attacks taking place on the network. The dataset contains 41 attributes and one class attribute categorized as normal, DoS, Probe, R2L and U2R. The proposed methodology aims to reduce the false positive rate and improve the detection rate by reducing the dimensionality of the dataset, since using all 41 attributes for detection is not good practice. Four feature selection methods (Chi-Square, Symmetric Uncertainty (SU), Gain Ratio and Information Gain) are used to evaluate the attributes, and unimportant features are removed to reduce the dimension of the data. The ensemble classification techniques Boosting, Bagging, Stacking and Voting are each evaluated for detection rate with three base algorithms: Decision Stump, J48 and Random Forest.
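
The entropy-based measures named in this abstract can be illustrated with a minimal sketch. It assumes discrete (already binned) feature values and uses the standard textbook definitions of information gain, gain ratio and symmetric uncertainty; the function names and the toy data are illustrative and are not taken from the paper. Chi-Square is omitted for brevity.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def conditional_entropy(feature, labels):
    """H(labels | feature): label entropy inside each feature value, weighted by frequency."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return sum(len(g) / total * entropy(g) for g in groups.values())

def information_gain(feature, labels):
    """Reduction in label entropy obtained by knowing the feature."""
    return entropy(labels) - conditional_entropy(feature, labels)

def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

def symmetric_uncertainty(feature, labels):
    """2 * IG / (H(feature) + H(labels)), a value between 0 and 1."""
    denom = entropy(feature) + entropy(labels)
    return 2.0 * information_gain(feature, labels) / denom if denom > 0 else 0.0

# Toy data (made up): one feature that separates the classes, one that does not.
labels = ["normal", "normal", "DoS", "DoS"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

for name, feature in [("informative", informative), ("uninformative", uninformative)]:
    print(name,
          information_gain(feature, labels),
          gain_ratio(feature, labels),
          symmetric_uncertainty(feature, labels))
```

Features whose scores fall near zero under such measures are the natural candidates for removal; the threshold used to drop them is a design choice that the abstract does not specify.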

2021 ◽  
Vol 336 ◽  
pp. 08008
Author(s):  
Tao Xie

To improve the detection rate and speed of intrusion detection systems, this paper proposes a feature selection algorithm. The algorithm uses information gain to rank the features in descending order, and then uses a multi-objective genetic algorithm to search the ranked features step by step for the optimal feature combination. We classified the Kddcup98 dataset into five classes (Normal, DOS, PROBE, R2L, and U2R) and conducted numerous experiments on each class. Experimental results show that, for each class of attack, the proposed algorithm not only speeds up feature selection but also significantly improves the detection rate.
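
The rank-then-search idea can be sketched as follows. This is a simplified stand-in, not the paper's method: the multi-objective genetic algorithm is replaced by a plain greedy walk down the ranking, and both the relevance scores and the `evaluate` function are hypothetical placeholders for information gain values and a real classifier evaluation.

```python
def rank_then_search(scores, evaluate):
    """Rank features by score (descending), then walk the ranking greedily.

    scores:   dict mapping feature name -> relevance score (e.g. information gain)
    evaluate: callable taking a list of features and returning a quality value
              (higher is better); supplied by the caller
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    selected, best = [], float("-inf")
    for feature in ranked:
        candidate = selected + [feature]
        quality = evaluate(candidate)
        if quality > best:  # keep the feature only if it improves the subset
            selected, best = candidate, quality
    return ranked, selected

# Hypothetical relevance scores and a toy evaluation function:
# each feature contributes some benefit, and every feature costs 0.05.
scores = {"duration": 0.1, "src_bytes": 0.9, "count": 0.6, "flag": 0.3}
benefit = {"src_bytes": 0.5, "count": 0.3, "flag": 0.02, "duration": 0.0}

def evaluate(features):
    return sum(benefit[f] for f in features) - 0.05 * len(features)

ranked, selected = rank_then_search(scores, evaluate)
print(ranked)
print(selected)
```

A genetic algorithm explores many subsets in parallel and can trade off several objectives at once; the greedy walk above only shows how a ranking narrows the order in which candidates are considered.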


2020 ◽  
Vol 16 (4) ◽  
pp. 72-86
Author(s):  
Preethi D. ◽  
Neelu Khare

In this article, EFS-LSTM, a deep recurrent learning model, is proposed for network intrusion detection systems. The EFS-LSTM model uses ensemble-based feature selection (EFS) and LSTM (Long Short-Term Memory) for the classification of network intrusions. The EFS combines five feature selection mechanisms, namely information gain, gain ratio, chi-square, correlation-based feature selection, and symmetric uncertainty-based feature selection. The experiments were conducted on the benchmark NSL-KDD dataset and implemented using TensorFlow and Python. The EFS-LSTM classifier is evaluated using classification performance metrics and compared against LSTM classifiers trained on all 41 features without feature selection and on the output of each individual feature selection technique. The performance study showed that the EFS-LSTM model performs better, with 99.8% accuracy, a higher detection rate and a lower false alarm rate.
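
The abstract does not say how the five selectors are combined. One common option, shown here purely as an assumption, is to aggregate the per-selector rankings with a Borda count and keep the top-ranked features. The selector names, feature names and scores below are made up for illustration.

```python
def aggregate_rankings(score_tables, top_k):
    """Combine several feature rankings with a Borda count.

    score_tables: dict mapping selector name -> dict of feature -> score
                  (higher score means more relevant)
    top_k:        number of features to keep
    """
    features = sorted(next(iter(score_tables.values())))
    points = {f: 0 for f in features}
    for scores in score_tables.values():
        ordered = sorted(features, key=lambda f: scores[f], reverse=True)
        for position, f in enumerate(ordered):
            # The best-ranked feature earns len(features) points, the worst earns 1.
            points[f] += len(features) - position
    return sorted(features, key=lambda f: points[f], reverse=True)[:top_k]

# Hypothetical scores from three selectors on three features.
score_tables = {
    "information_gain": {"src_bytes": 0.9, "count": 0.5, "duration": 0.1},
    "gain_ratio": {"src_bytes": 0.7, "count": 0.8, "duration": 0.2},
    "chi_square": {"src_bytes": 120.0, "count": 40.0, "duration": 60.0},
}

top_features = aggregate_rankings(score_tables, top_k=2)
print(top_features)
```

Working with ranks rather than raw scores avoids having to normalise measures that live on different scales, such as chi-square statistics and entropy-based values.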


2010 ◽  
Vol 9 ◽  
pp. CIN.S3794
Author(s):  
Xiaosheng Wang ◽  
Osamu Gotoh

Gene selection is of vital importance in the molecular classification of cancer using high-dimensional gene expression data. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, developing flexible and robust feature selection methods is crucial. We investigated the properties of a feature selection approach proposed in our previous work, which generalizes the feature selection method based on the depended degree of attribute in rough sets. We compared this method with established methods (the depended degree, chi-square, information gain, Relief-F and symmetric uncertainty) and analyzed its properties through a series of classification experiments. The results revealed that our method was superior to the canonical depended-degree-based method in robustness and applicability. Moreover, the method was comparable to the other four commonly used methods. More importantly, the method can expose the inherent classification difficulty of different gene expression datasets, reflecting the inherent biology of specific cancers.
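
For orientation, the classical rough-set depended degree that this work generalizes can be sketched as below. The sketch covers only the canonical measure (the fraction of samples whose attribute values determine the class unambiguously), not the authors' generalization, and it assumes discretised expression levels; the toy rows are invented.

```python
from collections import defaultdict

def depended_degree(rows, attribute_indices, decisions):
    """Classical rough-set dependency of the decision on a set of attributes.

    Samples are grouped by their values on the chosen attributes. A group is
    consistent when all of its samples share one decision; the measure is the
    fraction of samples that fall into consistent groups.
    """
    blocks = defaultdict(list)
    for row, decision in zip(rows, decisions):
        key = tuple(row[i] for i in attribute_indices)
        blocks[key].append(decision)
    consistent = sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)
    return consistent / len(rows)

# Toy data: two discretised "genes" per sample, plus a class label.
rows = [("high", "low"), ("high", "high"), ("low", "low"), ("low", "high")]
decisions = ["tumor", "tumor", "normal", "normal"]

print(depended_degree(rows, [0], decisions))  # gene 0 alone
print(depended_degree(rows, [1], decisions))  # gene 1 alone
```

In this toy case the first gene fully determines the class and the second carries no information, which is the contrast the measure is designed to expose.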


2014 ◽  
Vol 52 ◽  
Author(s):  
Ralf C. Staudemeyer ◽  
Christian W. Omlin

This work presents a data preprocessing and feature selection framework to support data mining and network security experts in selecting minimal feature sets for intrusion detection data. The process is supported by detailed visualisation and examination of class distributions. Distribution histograms, scatter plots and information gain are presented as supportive feature reduction tools. The feature reduction process applied is based on decision tree pruning and backward elimination. The paper starts with an analysis of the KDD Cup '99 datasets and their potential for feature reduction. The dataset consists of connection records with 41 features whose relevance for intrusion detection is not clear. All traffic is classified either as 'normal' or into one of four attack types: denial-of-service, network probe, remote-to-local or user-to-root. Using our custom feature selection process, we show how to significantly reduce the number of features in the dataset to a few salient features. We conclude by presenting minimal sets of 4 to 8 salient features for two-class and multi-class categorisation of intrusions, as well as for the detection of individual attack classes; the performance using a static classifier compares favourably to the performance using all available features. The suggested process is general in nature and can be applied to any similar dataset.
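
Backward elimination, one of the two techniques the abstract names, can be sketched generically as follows. This is an illustration under stated assumptions rather than the authors' procedure: the decision tree pruning step is not shown, and `evaluate` is a hypothetical stand-in for training and scoring a classifier on a feature subset.

```python
def backward_elimination(features, evaluate, tolerance=0.0):
    """Drop features one at a time while quality stays within tolerance.

    features:  iterable of feature names to start from
    evaluate:  callable taking a list of features and returning a quality value
    tolerance: largest acceptable drop relative to the full feature set
    """
    selected = list(features)
    reference = evaluate(selected)  # quality with every feature present
    removed_one = True
    while removed_one and len(selected) > 1:
        removed_one = False
        for feature in selected:
            trial = [f for f in selected if f != feature]
            if evaluate(trial) >= reference - tolerance:
                selected = trial      # the feature was not needed
                removed_one = True
                break                 # restart the scan on the smaller set
    return selected

# Toy evaluation (made up): only two features actually matter.
def evaluate(features):
    needed = {"src_bytes", "count"}
    return 0.3 + 0.3 * len(needed & set(features))

kept = backward_elimination(["duration", "src_bytes", "flag", "count"], evaluate)
print(kept)
```

Comparing against the full-set reference rather than the previous step keeps small, repeated losses from accumulating unnoticed.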


Author(s):  
Chunyong Yin ◽  
Luyu Ma ◽  
Lu Feng

Intrusion detection is a security mechanism used to detect attacks and intrusion behaviors. Because existing clonal selection algorithms applied to intrusion detection suffer from low accuracy and a high false positive rate, this paper proposes a feature selection method for an improved clonal algorithm. The improved method detects intrusion behavior by selecting the best overall individuals and cloning them. Experimental results show that the feature selection algorithm outperforms the traditional feature selection algorithm across different classifiers, and that the final detection results are better than those of the traditional clonal algorithm, with 99.6% accuracy and a 0.1% false positive rate.
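
The metrics quoted in this and the neighbouring abstracts follow the standard confusion-matrix definitions, sketched below. The counts are invented for illustration and do not reproduce any reported result; "attack" is treated as the positive class, which is an assumption about the papers' conventions.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard binary metrics with 'attack' as the positive class.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: normal traffic passed           fp: normal traffic flagged as attack
    Assumes each denominator is non-zero.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }

# Made-up counts: 100 attack records and 100 normal records.
metrics = detection_metrics(tp=90, tn=95, fp=5, fn=10)
print(metrics)
```

Accuracy alone can look high on imbalanced traffic, which is why detection rate and false positive rate are usually reported alongside it.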


2021 ◽  
Author(s):  
Jayaprakash Pokala ◽  
B. Lalitha

The Internet of Things (IoT) is a powerful recent trend that enables communication and networking among many sources over the internet. IoT networks based on the routing protocol for low power and lossy networks (RPL) may be exposed to many routing attacks because of the resource-constrained and open nature of IoT nodes. Hence, a network intrusion detection system (NIDS) is needed to protect RPL-based IoT networks from routing attacks. Existing techniques for anomaly-based NIDS (ANIDS) are subject to a high false alarm rate (FAR). Therefore, a novel bio-inspired voting ensemble classifier with a feature selection technique is proposed in this paper to improve the performance of ANIDS for RPL-based IoT networks. The proposed voting ensemble classifier combines the results of several base classifiers (logistic regression, support vector machine, decision tree, bidirectional long short-term memory and K-nearest neighbor) to detect attacks accurately using a majority voting rule. The optimized weights of the base classifiers are obtained using a feature selection method called simulated annealing based improved salp swarm algorithm (SA-ISSA), which is a hybridization of particle swarm optimization, opposition-based learning and the salp swarm algorithm. The experiments are performed on the RPL-NIDDS17 dataset, which contains seven types of attack instances. The performance of the proposed model is evaluated and compared with existing feature selection and classification techniques in terms of accuracy, attack detection rate (ADR), FAR and other measures. The proposed ensemble classifier shows better performance, with higher accuracy (96.4%), higher ADR (97.7%) and reduced FAR (3.6%).
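
The majority voting rule can be sketched in a few lines. The sketch assumes each base classifier has already produced a label for one sample; the weights shown are hypothetical placeholders, and the SA-ISSA optimization that the paper uses to obtain them is not implemented here.

```python
from collections import defaultdict

def weighted_vote(predictions, weights=None):
    """Return the label with the largest total vote for one sample.

    predictions: dict mapping classifier name -> predicted label
    weights:     optional dict mapping classifier name -> vote weight;
                 when omitted, every classifier gets one vote
    Ties are resolved in favour of the label that was seen first.
    """
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += 1.0 if weights is None else weights[name]
    return max(totals, key=totals.get)

# Hypothetical outputs of the five base classifiers for a single record.
predictions = {
    "logistic_regression": "attack",
    "svm": "normal",
    "decision_tree": "attack",
    "bilstm": "attack",
    "knn": "normal",
}
# Hypothetical weights; in the paper these would come from the optimizer.
weights = {"logistic_regression": 0.1, "svm": 0.4, "decision_tree": 0.1,
           "bilstm": 0.1, "knn": 0.3}

print(weighted_vote(predictions))           # plain majority
print(weighted_vote(predictions, weights))  # weighted vote
```

The two calls disagree on this record, which shows why the choice of weights matters: a minority of strongly weighted classifiers can overrule a simple majority.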


2018 ◽  
Vol 2 (1) ◽  
pp. 49-57
Author(s):  
Nabeela Ashraf ◽  
Waqar Ahmad ◽  
Rehan Ashraf

Due to the fast growth and widespread use of the internet over the last decades, network security problems are increasing rapidly. Humans cannot keep up with the speed of the processes and the huge amount of data involved in handling network anomalies, so substantial automation is needed to achieve both speed and accuracy. An Intrusion Detection System (IDS) is one approach to recognizing illegal access and rare attacks in order to secure networks. In this paper, Naive Bayes, J48 and Random Forest classifiers are compared in terms of the detection rate and accuracy of the IDS. The KDD_NSL dataset is used for the experiments.

