Intrusion Detection Using Random Forests Classifier with SMOTE and Feature Reduction

This work presents a data preprocessing and feature selection framework to support data mining and network security experts in selecting a minimal feature set for intrusion detection data. The process is supported by detailed visualisation and examination of class distributions. Distribution histograms, scatter plots and information gain are presented as supporting feature reduction tools. The feature reduction process is based on decision tree pruning and backward elimination. The paper starts with an analysis of the KDD Cup '99 datasets and their potential for feature reduction. The dataset consists of connection records with 41 features whose relevance for intrusion detection is not clear. All traffic is classified either as 'normal' or as one of four attack types: denial-of-service, network probe, remote-to-local or user-to-root. Using our custom feature selection process, we show how the number of features in the dataset can be reduced significantly to a few salient features. We conclude by presenting minimal sets of 4 to 8 salient features for two-class and multi-class categorisation for detecting intrusions, as well as for the detection of individual attack classes; the performance using a static classifier compares favourably to the performance using all available features. The suggested process is general in nature and can be applied to any similar dataset.
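The abstract names information gain as one of the supporting tools for ranking features. As a rough illustration only, the sketch below computes information gain for categorical features on a tiny made-up set of records. The function names, the toy feature names (`protocol`, `flag`) and the four records are assumptions introduced here; they are not taken from the paper or from the KDD Cup '99 data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting
    the records by the value of one categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy records (hypothetical): 'protocol' separates the classes perfectly,
# 'flag' carries no information about the class.
labels = ["normal", "normal", "attack", "attack"]
protocol = ["tcp", "tcp", "icmp", "icmp"]
flag = ["SF", "S0", "SF", "S0"]

gains = {
    "protocol": information_gain(protocol, labels),
    "flag": information_gain(flag, labels),
}
# Features ordered from most to least informative.
ranked = sorted(gains, key=gains.get, reverse=True)
```

A ranking of this kind could be used to decide which features to try dropping first during backward elimination; continuous features would need to be discretised before this calculation applies.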

Feature Selection Models Based on Hybrid Firefly Algorithm with Mutation Operator for Network Intrusion Detection

International Journal of Intelligent Engineering and Systems ◽

10.22266/ijies2021.0228.19 ◽

2021 ◽

Vol 14 (1) ◽

pp. 192-202

Author(s):

Karrar Alwan ◽

Ahmed AbuEl-Atta ◽

Hala Zayed ◽

...

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Firefly Algorithm ◽

Detection System ◽

Binary Classification ◽

Feature Reduction ◽

Detection Accuracy ◽

Multi Classification ◽

Modified Firefly Algorithm

Accurate intrusion detection is necessary to preserve network security. However, developing an efficient intrusion detection system is a complex problem due to the nonlinear nature of intrusion attempts, the unpredictable behaviour of network traffic, and the large number of features in the problem space. Hence, selecting the most effective and discriminating features is highly important. Additionally, eliminating irrelevant features can improve the detection accuracy as well as reduce the learning time of machine learning algorithms. However, feature reduction is an NP-hard problem. Therefore, several metaheuristics have been employed to determine the most effective feature subset within reasonable time. In this paper, two intrusion detection models are built based on a modified version of the firefly algorithm to perform the feature selection task. The first and second models are used for binary and multiclass classification, respectively. The modified firefly algorithm employs a mutation operation to avoid becoming trapped in local optima by enhancing the exploration capabilities of the original firefly algorithm. The significance of the selected features is evaluated using a Naïve Bayes classifier over a standard benchmark dataset, which contains different types of attacks. The results show the superiority of the modified firefly algorithm over the original firefly algorithm in terms of classification accuracy and the number of selected features under different scenarios. Additionally, the results confirm the superiority of the proposed intrusion detection system over other recently proposed systems in both binary classification and multi-classification scenarios. The proposed system achieves 96.51% and 96.942% detection accuracy in binary classification and multi-classification, respectively. Moreover, the proposed system reduces the number of attributes from 41 to 9 for binary classification and to 10 for multi-classification.
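The abstract does not specify how the mutation operator is defined, so the following is only a minimal sketch of one common choice: bit-flip mutation on a binary feature mask, with the mutated mask kept only when it scores higher. The `fitness` function is a hypothetical stand-in (summed per-feature relevance minus a small penalty per selected feature); in the paper the quality of a feature subset is evaluated with a Naïve Bayes classifier instead. All names and parameter values here are assumptions.

```python
import random


def mutate(mask, rate, rng):
    """Bit-flip mutation: flip each bit of the feature mask with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


def fitness(mask, relevance, penalty=0.01):
    """Hypothetical stand-in objective: total relevance of the selected
    features minus a small penalty for each selected feature."""
    selected_relevance = sum(r for bit, r in zip(mask, relevance) if bit)
    return selected_relevance - penalty * sum(mask)


def mutate_if_better(mask, relevance, rate, rng):
    """Return the mutated mask only if it has strictly higher fitness."""
    candidate = mutate(mask, rate, rng)
    if fitness(candidate, relevance) > fitness(mask, relevance):
        return candidate
    return mask


# Usage: repeatedly mutate a random mask over five hypothetical features.
rng = random.Random(0)
relevance = [0.9, 0.0, 0.4, 0.0, 0.7]
mask = [rng.randint(0, 1) for _ in relevance]
for _ in range(50):
    mask = mutate_if_better(mask, relevance, rate=0.2, rng=rng)
```

In a full firefly search this step would sit alongside the attraction-based movement of each firefly; the greedy acceptance shown here is a design choice of the sketch and is not claimed to match the paper.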

Feature reduction scheme for anomaly‐based intrusion detection in wireless networks: Building of hybrid model

Transactions on Emerging Telecommunications Technologies ◽

10.1002/ett.4367 ◽

2021 ◽

Author(s):

Shashank Gavel ◽

Jyotsana Singh ◽

Namrata Shukla ◽

Ajay Singh Raghuvanshi ◽

Sudarshan Tiwari

Keyword(s):

Wireless Networks ◽

Intrusion Detection ◽

Hybrid Model ◽

Feature Reduction ◽

Reduction Scheme

Anomaly-based Intrusion Detection in Industrial Data with SVM and Random Forests

2019 International Conference on Software, Telecommunications and Computer Networks (SoftCOM) ◽

10.23919/softcom.2019.8903672 ◽

2019 ◽

Cited By ~ 9

Author(s):

Simon D. Duque Anton ◽

Sapna Sinha ◽

Hans Dieter Schotten

Keyword(s):

Intrusion Detection ◽

Random Forests ◽

Industrial Data

Machine Learning Techniques for Feature Reduction in Intrusion Detection Systems: A Comparison

2009 Fourth International Conference on Computer Sciences and Convergence Information Technology ◽

10.1109/iccit.2009.89 ◽

2009 ◽

Cited By ~ 9

Author(s):

M. Bahrololum ◽

E. Salahi ◽

M. Khaleghi

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Feature Reduction ◽

Intrusion Detection Systems ◽

Machine Learning Techniques ◽

Detection Systems ◽

Learning Techniques

Multilevel Intrusion Detection System with Affinity Clustering and Ensemble SVM

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit206431 ◽

2020 ◽

pp. 522-529

Author(s):

Sadhana Patidar ◽

Priyanka Parihar ◽

Chetan Agrawal

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Model Performance ◽

Feature Reduction ◽

Support Vector ◽

Learning Tools ◽

Security Issues ◽

Proposed Model

As the number of applications running over the internet grows, so do network security issues. Many security applications are designed to address these concerns, but they still need improvement in both speed and accuracy. As technology advances, new threats and attacks also emerge, so a detection system must be designed to handle new threats in the network. One network security tool is the intrusion detection system, which is used to detect malicious data packets. Machine learning tools are also used to improve the efficiency of network-based intrusion detection systems. In this paper, an intrusion detection system that applies machine learning tools is proposed. The proposed model integrates feature reduction, affinity clustering and a multilevel ensemble Support Vector Machine. The model's performance is analyzed on two datasets, NSL-KDD and UNSW-NB15, and it achieves an efficiency gain of approximately 12% over existing work.

Feature Reduction and Classifications Techniques for Intrusion Detection System

2020 International Conference on Communication and Signal Processing (ICCSP) ◽

10.1109/iccsp48568.2020.9182216 ◽

2020 ◽

Author(s):

Gulab Sah ◽

Subhasish Banerjee

Keyword(s):

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Feature Reduction

Performance Analysis of Intrusion Detection Systems Using a Feature Selection Method on the UNSW-NB15 Dataset

Journal of Big Data ◽

10.1186/s40537-020-00379-6 ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Sydney M. Kasongo ◽

Yanxia Sun

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Computer Networks ◽

Feature Selection Method ◽

Selection Method ◽

Feature Reduction ◽

Intrusion Detection Systems ◽

Support Vector ◽

Test Accuracy ◽

Detection Systems

Computer network intrusion detection systems (IDSs) and intrusion prevention systems (IPSs) are critical aspects that contribute to the success of an organization. Over the past years, IDSs and IPSs using different approaches have been developed and implemented to ensure that computer networks within enterprises are secure, reliable and available. In this paper, we focus on IDSs that are built using machine learning (ML) techniques. IDSs based on ML methods are effective and accurate in detecting network attacks. However, the performance of these systems decreases for high-dimensional data spaces. Therefore, it is crucial to implement an appropriate feature extraction method that can prune features that do not have a great impact on the classification process. Moreover, many ML-based IDSs suffer from an increased false positive rate and low detection accuracy when the models are trained on highly imbalanced datasets. In this paper, we present an analysis of the UNSW-NB15 intrusion detection dataset, which is used for training and testing our models. Moreover, we apply a filter-based feature reduction technique using the XGBoost algorithm. We then implement the following ML approaches using the reduced feature space: Support Vector Machine (SVM), k-Nearest-Neighbour (kNN), Logistic Regression (LR), Artificial Neural Network (ANN) and Decision Tree (DT). In our experiments, we considered both the binary and multiclass classification configurations. The results demonstrate that the XGBoost-based feature selection method allows methods such as the DT to increase their test accuracy from 88.13% to 90.85% for the binary classification scheme.
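A filter-based reduction of this kind can be pictured as two steps: score each feature, then keep only the features whose score clears a cut-off before training the downstream classifier. The sketch below shows only the second step, assuming the importance scores have already been obtained (for example, from a trained gradient-boosting model). The feature names, scores and threshold are made-up placeholders, not values from the paper or from UNSW-NB15.

```python
def select_features(importances, threshold):
    """Keep features whose importance is at least `threshold`,
    ordered from most to least important."""
    kept = [(name, score) for name, score in importances.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept]


def project(rows, columns, selected):
    """Reduce each row to the selected columns, in the order given by `selected`."""
    indices = [columns.index(name) for name in selected]
    return [[row[i] for i in indices] for row in rows]


# Hypothetical importance scores for three placeholder features.
importances = {"feat_a": 0.5, "feat_b": 0.05, "feat_c": 0.3}
columns = ["feat_a", "feat_b", "feat_c"]
rows = [[1, 2, 3], [4, 5, 6]]

selected = select_features(importances, threshold=0.1)
reduced_rows = project(rows, columns, selected)
```

The reduced rows would then be passed to whichever classifier is being evaluated; how the threshold is chosen is left open here, since the abstract does not state it.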
