Fusion of Feature Selection and Random Forest for an Anomaly-Based Intrusion Detection System

Given the current network landscape, hackers and intruders have become a major threat. As new technologies emerge quickly and computers are used ever more extensively, security plays an important role. Most computers in a network can be easily compromised by attacks, and the growing number of new attack types is a serious concern. Securing sensitive data is a high-priority issue that should be addressed immediately. Highly efficient Intrusion Detection Systems (IDS) that detect various types of network attacks are available nowadays, but what is required is an IDS intelligent enough to detect and analyze all types of new threats on the network with the highest possible accuracy. An Intrusion Detection System can be hardware or software that monitors and analyzes all network activity to detect malicious activity inside the network. It also informs the administrator and helps deal with malicious packets which, if they enter the network, can harm many of the connected computers. In our work we have implemented an intelligent IDS that helps the administrator analyze real-time network traffic by classifying packets entering the system as normal or malicious. This paper mainly focuses on the feature selection techniques used to reduce the number of features in the KDD-99 dataset. It also explains the classification algorithm, Random Forest, which uses a forest of trees to classify a real-time packet as normal or malicious. Random forest uses ensembling to produce a final output by combining the outputs of the trees in the forest. The experiments use the KDD-99 dataset to train all the trees. From the results achieved, we observe that the random forest algorithm gives higher accuracy in a distributed network with a reduced false alarm rate.
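
The ensembling step described above, where the forest's output combines the outputs of its trees, can be illustrated with a minimal sketch. It assumes simple majority voting; the single-threshold "trees" and their thresholds are hypothetical stand-ins for trained trees, and the field names follow KDD-99 feature names only for readability. It is not the paper's implementation.

```python
from collections import Counter

# Hypothetical stand-ins for trained trees: each "tree" is one threshold rule
# on a single field of a connection record. A real forest learns deeper trees.
def make_stump(feature, threshold):
    def predict(record):
        return "malicious" if record[feature] > threshold else "normal"
    return predict

def forest_predict(trees, record):
    """Collect one vote per tree and return the majority label with the votes."""
    votes = [tree(record) for tree in trees]
    label, _ = Counter(votes).most_common(1)[0]
    return label, votes

# Thresholds below are illustrative assumptions, not values from the paper.
trees = [
    make_stump("src_bytes", 10_000),
    make_stump("count", 100),
    make_stump("serror_rate", 0.5),
]
record = {"src_bytes": 250, "count": 300, "serror_rate": 0.9}
label, votes = forest_predict(trees, record)
print(label, votes)
```

Two of the three stumps flag this record, so the majority label is "malicious" even though one tree disagrees.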

Enhanced Tree Based Real Time Intrusion Detection System in Big Data

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v15i3.1671 ◽

2016 ◽

Vol 15 (3) ◽

pp. 6563-6569

Author(s):

S.J.SATHISH AARON JOSEPH ◽

R. BALASUBRAMANIAN

Keyword(s):

Big Data ◽

Random Forest ◽

Intrusion Detection ◽

Real Time ◽

Intrusion Detection System ◽

Detection System ◽

Training Data ◽

Random Forest Algorithm ◽

Digital Form ◽

Hadoop Cluster

Intrusion detection is one of the major necessities of the current networked environment, where all information is available in digital form. This paper presents an enhanced tree-based approach that can be used to perform intrusion detection faster and with better accuracy. The training data is subjected to the random forest algorithm. This algorithm is a combination of tree predictors, and each tree depends on a randomly generated vector. A Spark-based implementation of the Random Forest algorithm is used in a Hadoop cluster on datasets with varied imbalance to obtain the results. It has been observed that the classifier provided results in real time with an accuracy above 90%, and is hence well suited to online intrusion detection.

Explainable AI and Random Forest Based Reliable Intrusion Detection system

10.36227/techrxiv.17169080 ◽

2021 ◽

Author(s):

Syed Wali ◽

Irfan Khan

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Cyber Attacks ◽

Detection Systems ◽

Cyber Threats ◽

The Past ◽

Explainable Ai ◽

Series Of Experiments

Emerging cyber threats, together with an increased dependency on vulnerable cyber-networks, have jeopardized all stakeholders, making Intrusion Detection Systems (IDS) an essential network security requirement. Several IDS have been proposed in the past decade for protecting systems from cyber-attacks. Machine learning (ML) based IDS have shown remarkable performance on conventional cyber threats. However, the introduction of adversarial attacks in the cyber domain highlights the need to upgrade these IDS, because conventional ML-based approaches are vulnerable to adversarial attacks. Therefore, the proposed IDS framework leverages the performance of conventional ML-based IDS and integrates it with Explainable AI (XAI) to deal with adversarial attacks. A global explanation of the AI model, extracted by SHAP (Shapley additive explanations) during the training phase of the primary Random Forest Classifier (RFC), is used to reassess the credibility of predicted outcomes. In other words, an outcome with low credibility is reassessed by secondary classifiers. This SHAP-based approach helps filter out disguised malicious network traffic and can also enhance user trust by adding transparency to the decision-making process. The adversarial robustness of the proposed IDS was assessed using the HopSkipJump attack and the CICIDS dataset, where the IDS showed 98.5% and 100% accuracy, respectively. Furthermore, the performance of the proposed IDS is compared with conventional algorithms using recall, precision, accuracy, and F1-score as evaluation metrics. This comparative analysis and series of experiments endorse the credibility of the proposed scheme, showing that the integration of XAI with conventional IDS can ensure the credibility, integrity, and availability of cyber-networks.

Fuzzy Rule-Based Layered Classifier and Entropy-Based Feature Selection for Intrusion Detection System

Handbook of Research on Cyber Crime and Information Privacy - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-5728-0.ch015 ◽

2021 ◽

pp. 289-309

Author(s):

Devaraju Sellappan ◽

Ramakrishnan Srinivasan

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Fuzzy Rule ◽

Intrusion Detection Systems ◽

Rule Based ◽

Detection Systems ◽

Positive Rate ◽

Rule Based Classifier

Intrusion detection systems must detect vulnerabilities in a network consistently and also perform efficiently under a huge amount of traffic. They must be capable of detecting emerging and proactive threats in the network. Various classifiers are used to classify system activity as normal or intrusive. In this chapter, a layered fuzzy rule-based classifier is proposed to detect the various intrusions and improve the performance of the intrusion detection system, and fuzzy entropy-based feature selection is proposed to identify the relevant features. The KDD dataset contains various attacks; these attacks are grouped into four classes, namely Denial-of-Service (DoS), Probe, Remote-to-Local (R2L), and User-to-Root (U2R). A real-time dataset is also considered in this research. Experimental results show that the proposed method provides a good detection rate, minimizes the false positive rate, and requires less computational time.

Improving DDoS Attack Predection Performance using Ensambling Techniqes

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c6860.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 4760-4763

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Feature Subset Selection ◽

Intrusion Detection Systems ◽

Support Vector ◽

Feature Subset ◽

Detection Systems ◽

Ddos Attack

This paper proposes using support vector machine (SVM), neural network, and C5 decision tree algorithms to predict undesirable data. Intrusion detection systems exist to counter DoS attacks; however, their performance must be maintained. Accordingly, we propose a novel model for an intrusion detection system on a cloud platform using a random forest classifier and an XGBoost model. Random Forest (RF) is an ensemble classifier and performs well compared with other conventional classifiers for effective classification of attacks. The intrusion detection system is made fast and efficient through optimal feature subset selection using information gain (IG). In this paper, we demonstrate DDoS anomaly detection on open cloud DDoS attack datasets using Random Forest and Gradient Boosting (GB) machine learning (ML) models.
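
To make the IG-based feature subset selection mentioned above concrete, here is a minimal sketch. It assumes IG denotes the standard information gain (the reduction in label entropy after splitting on a feature) and that features are discrete; the column names and toy records are hypothetical, and the paper's exact procedure may differ.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def top_k_features(columns, labels, k):
    """Rank feature columns by information gain and keep the best k names."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]

# Toy, hypothetical data: "protocol" separates the classes, "flag" does not.
labels = ["attack", "attack", "normal", "normal"]
columns = {
    "protocol": ["udp", "udp", "tcp", "tcp"],
    "flag": ["a", "b", "a", "b"],
}
print(top_k_features(columns, labels, k=1))
```

Continuous features would need to be discretized (or handled with threshold splits) before this calculation applies.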

Real-time Distributed-Random-Forest-Based Network Intrusion Detection System Using Apache Spark

2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC) ◽

10.1109/pccc.2018.8711068 ◽

2018 ◽

Cited By ~ 4

Author(s):

Hao Zhang ◽

Shumin Dai ◽

Yongdan Li ◽

Wenjun Zhang

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Real Time ◽

Intrusion Detection System ◽

Detection System ◽

Apache Spark ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Network Intrusion Detection System

Intrusion Detection System for Malicious Traffic Using Evolutionary Search Algorithm

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200821162547 ◽

2020 ◽

Vol 13 ◽

Author(s):

Samar Al-Saqqa ◽

Mustafa Al-Fayoumi ◽

Malik Qasaimeh

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Search Algorithm ◽

Detection System ◽

Feature Subset Selection ◽

Intrusion Detection Systems ◽

Feature Subset ◽

Evolutionary Search ◽

Detection Systems

Introduction: Intrusion detection systems play a key role in system security by identifying potential attacks and giving appropriate responses. As new attacks are always emerging, intrusion detection systems must adapt to them, and more work is continuously needed to develop new methods and techniques that yield efficient and effective adaptive intrusion detection systems. Feature selection is one of the challenging areas that needs more work because of its importance and impact on the performance of intrusion detection systems. This paper applies an evolutionary search algorithm to feature subset selection for intrusion detection systems. Methods: The evolutionary search algorithm for feature subset selection is applied, and two classifiers, Naïve Bayes and the J48 decision tree, are used to evaluate system performance before and after feature selection. The NSL-KDD dataset and its subsets are used in all evaluation experiments. Results: The results show that feature selection using the evolutionary search algorithm enhances the intrusion detection system with respect to detection accuracy and detection of unknown attacks. Furthermore, time performance improves through reduced training time, which is reflected positively in overall system performance. Discussion: The evolutionary search applied to select features for IDS algorithms can be developed further by modifying and enhancing the mutation and crossover operators and applying new enhanced techniques in the selection process, which can give better results and enhance the performance of intrusion detection for rare and complicated attacks. Conclusion: The evolutionary search algorithm is applied to find the best subset of features for the intrusion detection system. It is a promising approach to feature selection for intrusion detection, and the results showed better performance for the intrusion detection system in terms of accuracy and detection rate.
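
The selection, crossover, and mutation steps referred to above can be sketched as a small genetic search over feature masks. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the population size, mutation rate, one-point crossover, and the toy fitness function are all hypothetical, and in practice the fitness would be a classifier's validation score on the selected features.

```python
import random

def evolve_feature_subset(n_features, fitness, generations=30, pop_size=20,
                          mutation_rate=0.05, seed=0):
    """Search for a boolean feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)        # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)

# Hypothetical fitness: reward three "useful" features, penalize subset size.
INFORMATIVE = {0, 3, 7}

def toy_fitness(mask):
    hits = sum(1 for i, on in enumerate(mask) if on and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

best = evolve_feature_subset(n_features=10, fitness=toy_fitness)
print([i for i, on in enumerate(best) if on], toy_fitness(best))
```

Because the better half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.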

Building Auto-Encoder Intrusion Detection System based on random forest feature selection

Computers & Security ◽

10.1016/j.cose.2020.101851 ◽

2020 ◽

Vol 95 ◽

pp. 101851 ◽

Cited By ~ 6

Author(s):

XuKui Li ◽

Wei Chen ◽

Qianru Zhang ◽

Lifa Wu

Keyword(s):

Feature Selection ◽

Random Forest ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System

A novel time efficient learning-based approach for smart intrusion detection system

Journal Of Big Data ◽

10.1186/s40537-021-00498-8 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Sugandh Seth ◽

Gurvinder Singh ◽

Kuljit Kaur Chahal

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Intrusion Detection Systems ◽

Gradient Boosting ◽

Detection Systems ◽

Network Parameters ◽

Proposed Model ◽

Precision Rate

Background: The ever-increasing sophistication of intrusion approaches has led to a dire need for Intrusion Detection Systems with optimal efficacy. However, existing Intrusion Detection Systems have been developed using outdated attack datasets, with more focus on prediction accuracy and less on prediction latency. The evolution of the smart Intrusion Detection System framework aims at designing and deploying security systems that use various parameters to analyze current and dynamic traffic trends and are highly time-efficient in predicting intrusions. Aims: This paper proposes a novel approach for a time-efficient and smart Intrusion Detection System. Method: We propose a Hybrid Feature Selection approach that aims to reduce prediction latency without affecting attack prediction performance by lowering the model's complexity. Light Gradient Boosting Machine (LightGBM), a fast gradient boosting framework, is used to build the model on the latest CIC-IDS 2018 dataset. Results: The proposed feature selection reduces the prediction latency by 44.52% to 2.25% and the model building time by 52.68% to 17.94%, depending on the algorithm, on the CIC-IDS 2018 dataset. The proposed model with hybrid feature selection and LightGBM gives 97.73% accuracy, 96% sensitivity, a 99.3% precision rate, and comparatively low prediction latency. The proposed model achieved an increase of 1.5% in accuracy and 3% in precision over the existing model. An in-depth analysis of network parameters is also performed, which gives insight into how network parameters vary between benign and malicious sessions.
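
The accuracy, sensitivity, and precision figures reported in abstracts like this one derive from confusion-matrix counts. The sketch below shows the standard definitions, with false alarm rate added for comparison; the counts are made up for illustration and are not taken from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary metrics, treating 'attack' as the positive class.

    Assumes every denominator is non-zero (both classes present and at
    least one positive prediction).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),       # also called recall or detection rate
        "precision": tp / (tp + fp),
        "false_alarm_rate": fp / (fp + tn),  # benign traffic flagged as attack
    }

# Hypothetical counts for 100 attack and 100 benign sessions.
metrics = detection_metrics(tp=96, fp=1, tn=99, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On imbalanced traffic, accuracy alone can look high while sensitivity is poor, which is why these abstracts report several metrics side by side.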

Performance analysis of machine learning models for intrusion detection system using Gini Impurity-based Weighted Random Forest (GIWRF) feature selection technique

Cybersecurity ◽

10.1186/s42400-021-00103-8 ◽

2022 ◽

Vol 5 (1) ◽

Author(s):

Raisa Abedin Disha ◽

Sajjad Waheed

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Performance Analysis ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Experimental Result ◽

Feature Selection Technique ◽

Selection Technique

To protect the network, resources, and sensitive data, the intrusion detection system (IDS) has become a fundamental component of organizations that prevents cybercriminal activities. Several approaches have been introduced and implemented to thwart malicious activities so far. Due to the effectiveness of machine learning (ML) methods, the proposed approach applied several ML models for the intrusion detection system. To evaluate the performance of the models, the UNSW-NB 15 and Network TON_IoT datasets were used for offline analysis. Both datasets are newer than the NSL-KDD dataset and better represent modern-day attacks. The performance analysis was carried out by training and testing the Decision Tree (DT), Gradient Boosting Tree (GBT), Multilayer Perceptron (MLP), AdaBoost, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) models on the binary classification task. As the performance of an IDS deteriorates with a high-dimensional feature vector, an optimum set of features was selected through a Gini Impurity-based Weighted Random Forest (GIWRF) model as the embedded feature selection technique. This technique employed Gini impurity as the splitting criterion of the trees and adjusted the weights for the two classes of the imbalanced data so that the learning algorithm accounts for the class distribution. Based on the importance score, 20 features were selected from UNSW-NB 15 and 10 features from the Network TON_IoT dataset. The experimental results revealed that DT performed better with the feature selection technique than the other trained models in this experiment. Moreover, the proposed GIWRF-DT outperformed other existing methods surveyed in the literature in terms of the F1 score.
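
The effect of class weights on Gini impurity, as described above, can be shown with a short worked sketch. It assumes the common "balanced" heuristic, where each class weight is n_samples / (n_classes * class_count); the paper's actual weighting scheme may differ, and the 9:1 toy node is hypothetical.

```python
from collections import Counter

def weighted_gini(labels, class_weight):
    """Gini impurity of a node where each sample counts with its class weight."""
    counts = Counter(labels)
    weighted = {c: class_weight[c] * n for c, n in counts.items()}
    total = sum(weighted.values())
    return 1.0 - sum((w / total) ** 2 for w in weighted.values())

def balanced_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# Hypothetical imbalanced node: 9 normal records and 1 attack record.
labels = ["normal"] * 9 + ["attack"]
uniform = {"normal": 1.0, "attack": 1.0}
print(weighted_gini(labels, uniform))                   # impurity with equal weights
print(weighted_gini(labels, balanced_weights(labels)))  # impurity with balanced weights
```

With equal weights the node looks nearly pure, so a tree has little incentive to split off the single attack record; with balanced weights the same node is treated as maximally mixed for two classes.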
