Analysis of NSL KDD Dataset Using Classification Algorithms for Intrusion Detection System

Srishti Sharma; Yogita Gigras; Rita Chhikara; Anuradha Dhull

doi:10.2174/1872212112666180402122150

Analysis of NSL KDD Dataset Using Classification Algorithms for Intrusion Detection System

Recent Patents on Engineering ◽

10.2174/1872212112666180402122150 ◽

2019 ◽

Vol 13 (2) ◽

pp. 142-147

Author(s):

Srishti Sharma ◽

Yogita Gigras ◽

Rita Chhikara ◽

Anuradha Dhull

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Detection System ◽

Random Trees ◽

Attribute Selection ◽

Classification Algorithms ◽

Random Forest Classification ◽

Detection Systems ◽

Forest Classification ◽

Feature Attribute

Background: Intrusion detection systems (IDS) are responsible for detecting anomalies and network attacks. Building an effective IDS depends on a readily available dataset, which is used to train and test the intelligent IDS. This research uses the NSL KDD dataset, an improvement over the original KDD Cup 1999 dataset, because KDD’99 contains a huge number of redundant records that make it difficult to process the data accurately. Methods: The classification techniques applied to this dataset are the decision-tree methods J48, Random Forest and Random Trees. Results: In a comparison of these three classification algorithms, Random Forest produced the best results and was therefore used to analyze the data further. The results are analyzed and presented in this paper with the help of feature/attribute selection by applying all the possible combinations. Conclusion: A total of eight significant attributes were selected after applying various attribute selection methods to the NSL KDD dataset.
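One reading of "applying all the possible combinations" is an exhaustive search over attribute subsets. The sketch below illustrates that idea only and is not the authors' procedure: the toy records and field names (`duration`, `src_bytes`, `count`) are made up, and a leave-one-out 1-nearest-neighbour rule stands in for the Random Forest classifier named in the abstract.

```python
from itertools import combinations

# Made-up toy records: (feature dict, label). Not taken from NSL KDD.
DATA = [
    ({"duration": 0, "src_bytes": 9000, "count": 1}, "attack"),
    ({"duration": 1, "src_bytes": 8500, "count": 9}, "attack"),
    ({"duration": 0, "src_bytes": 300, "count": 2}, "normal"),
    ({"duration": 1, "src_bytes": 250, "count": 8}, "normal"),
]

def loo_accuracy(features):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule on the chosen features."""
    hits = 0
    for i, (x, label) in enumerate(DATA):
        others = [d for j, d in enumerate(DATA) if j != i]
        # Squared Euclidean distance restricted to the selected features.
        nearest = min(others, key=lambda d: sum((x[f] - d[0][f]) ** 2 for f in features))
        hits += nearest[1] == label
    return hits / len(DATA)

def best_subset(all_features):
    """Score every non-empty feature subset; keep the smallest one with the top score."""
    best, best_score = None, -1.0
    for k in range(1, len(all_features) + 1):
        for subset in combinations(all_features, k):
            score = loo_accuracy(subset)
            if score > best_score:  # strict '>' keeps the earliest (smallest) winner
                best, best_score = subset, score
    return best, best_score

print(best_subset(["duration", "src_bytes", "count"]))
```

An exhaustive search visits 2^n - 1 subsets for n attributes, so it is practical only for a small candidate set; with many attributes, a ranking or greedy selection method is the usual alternative.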

Chronic Kidney Disease for Collaborative Healthcare Data Analytics using Random Forest Classification Algorithms

2021 International Conference on Computer Communication and Informatics (ICCCI) ◽

10.1109/iccci50826.2021.9402574 ◽

2021 ◽

Author(s):

V. Shanmugarajeshwari ◽

M. Ilayaraja

Keyword(s):

Chronic Kidney Disease ◽

Random Forest ◽

Kidney Disease ◽

Data Analytics ◽

Classification Algorithms ◽

Random Forest Classification ◽

Healthcare Data ◽

Forest Classification

Fusion of Feature Selection and Random Forest for an Anomaly-Based Intrusion Detection System

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.8332 ◽

2019 ◽

Vol 16 (8) ◽

pp. 3603-3607 ◽

Cited By ~ 1

Author(s):

Shraddha Khonde ◽

V. Ulagamuthalvi

Keyword(s):

Feature Selection ◽

Random Forest ◽

Intrusion Detection ◽

Real Time ◽

Intrusion Detection System ◽

New Technologies ◽

Detection System ◽

Sensitive Data ◽

Detection Systems ◽

New Type

In the current network scenario, hackers and intruders have become a major threat. As new technologies emerge quickly and computers are used extensively, security plays an important role. Most computers in a network can be easily compromised by attacks, and the growing number of new attack types is a major concern. Securing sensitive data is a serious challenge that needs to be treated as a high-priority issue and addressed immediately. Highly efficient Intrusion Detection Systems (IDS) that detect various types of network attacks are available nowadays, but what is required is an IDS intelligent enough to detect and analyze all types of new threats on the network. Maximum accuracy is expected of any such intelligent intrusion detection system. An Intrusion Detection System can be hardware or software that monitors and analyzes all network activity to detect malicious activity inside the network. It also informs the administrator and helps deal with malicious packets which, if they enter the network, can harm many of the connected computers. In our work we have implemented an intelligent IDS that helps the administrator analyze real-time network traffic by classifying packets entering the system as normal or malicious. This paper mainly focuses on the feature selection techniques used to reduce the number of features in the KDD-99 dataset. It also explains the classification algorithm used, Random Forest, which works with a forest of trees to classify real-time packets as normal or malicious. Random Forest uses ensembling to produce a final output derived by combining the outputs of the trees in the forest. The KDD-99 dataset used in the experiments serves to train all the trees so that the random forest achieves higher accuracy. The results show that the random forest algorithm gives higher accuracy in a distributed network with a reduced false alarm rate.
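The ensembling step described in this abstract, combining the outputs of several trees into one decision, can be sketched as a majority vote. This is a minimal illustration rather than the authors' implementation: the three "trees" are hypothetical hand-written rules over made-up packet fields (`duration`, `src_bytes`, `failed_logins`), standing in for trained decision trees.

```python
from collections import Counter

# Hypothetical stand-ins for trained trees; each maps a packet record to a label.
def tree_a(pkt):
    return "malicious" if pkt["failed_logins"] > 3 else "normal"

def tree_b(pkt):
    return "malicious" if pkt["src_bytes"] > 10_000 else "normal"

def tree_c(pkt):
    return "malicious" if pkt["duration"] == 0 and pkt["src_bytes"] > 5_000 else "normal"

FOREST = [tree_a, tree_b, tree_c]

def forest_predict(pkt, trees=FOREST):
    """Return the label that most trees vote for."""
    votes = Counter(tree(pkt) for tree in trees)
    return votes.most_common(1)[0][0]

# Two of the three rules flag this made-up packet, so the vote is "malicious".
packet = {"duration": 0, "src_bytes": 12_000, "failed_logins": 1}
print(forest_predict(packet))
```

An odd number of voters with two labels avoids ties; a real random forest would also train each tree on a bootstrap sample with a random subset of features.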

Machine Learning Techniques for Intrusion Detection

Handbook of Research on Intrusion Detection Systems - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-2242-4.ch003 ◽

2020 ◽

pp. 47-65

Author(s):

Tameem Ahmad ◽

Mohd Asad Anwar ◽

Misbahul Haque

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

False Alarm ◽

False Alarm Rate ◽

Detection Rate ◽

Clustering Algorithms ◽

Training Data ◽

Hybrid Classifier ◽

Random Forest Classification ◽

Forest Classification

This chapter proposes a hybrid classifier technique for a network Intrusion Detection System by implementing a method that combines the Random Forest classification technique with K-Means and Gaussian Mixture clustering algorithms. Random Forest builds patterns of intrusion over training data for misuse detection, while anomaly-detection intrusions are identified by the outlier-detection mechanism. The implementation and simulation of the proposed method are carried out for various metrics under varying threshold values. The effectiveness of the proposed method is evaluated with metrics such as precision, recall, accuracy rate, false alarm rate, and detection rate. The various existing algorithms are analyzed extensively. It is observed experimentally that the proposed method gives superior results compared to the existing simpler classifiers as well as existing hybrid classifier techniques. The proposed hybrid classifier technique outperforms other common existing classifiers with an accuracy of 99.84%, a false alarm rate of 0.09% and a detection rate of 99.7%.
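The metrics named in this abstract can all be derived from confusion-matrix counts. The sketch below uses the common textbook definitions, which may differ from the chapter's own: in particular it assumes that detection rate equals recall, and the counts are made up for illustration.

```python
def ids_metrics(tp, fp, tn, fn):
    """Compute common IDS metrics from confusion-matrix counts.

    "Positive" means an attack and "negative" means normal traffic.
    Assumes every denominator below is nonzero.
    """
    total = tp + fp + tn + fn
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),            # assumed here to equal detection rate
        "accuracy": (tp + tn) / total,
        "false_alarm_rate": fp / (fp + tn),  # share of normal traffic flagged as attack
    }

# Made-up counts, for illustration only.
m = ids_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```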

Efficient Data Mining Techniques for Heart Disease Prediction and Comparative Analysis of Classification Algorithms

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v12i230281 ◽

2021 ◽

pp. 57-68

Author(s):

Md. Ashikur Rahman Khan ◽

Masudur Rahman ◽

Jayed Us Salehin ◽

Md. Saiful Islam ◽

Md. Fazle Rabbi

Keyword(s):

Data Mining ◽

Heart Disease ◽

Random Forest ◽

Heart Diseases ◽

Classification Algorithms ◽

Middle Income ◽

Data Mining Techniques ◽

Random Forest Classification ◽

Forest Classification ◽

Efficient Data

Data mining techniques are used to extract interesting patterns and discover meaningful knowledge from huge amounts of data. The use of data mining techniques on medical data has been increasing, to determine useful trends and patterns for analysis and decision making. About eighty percent of deaths due to heart diseases occur in low- and middle-income countries. The healthcare industry generates large amounts of heart disease data that are not organized, which makes the prediction process more complicated and voluminous. Data mining provides techniques for fast and accurate transformation of data into useful information for heart disease prediction. The main objective of this research is to predict heart diseases more accurately using the Naïve Bayes, J48 Decision Tree, Neural Network and Random Forest classification algorithms and to compare the performance of the classifiers. The research uses a raw dataset for performance analysis, and the analysis is based on the Weka tool. The research also shows that Random Forest is the best of these techniques in terms of accuracy and execution time.

Explainable AI and Random Forest Based Reliable Intrusion Detection system

10.36227/techrxiv.17169080 ◽

2021 ◽

Author(s):

Syed Wali ◽

Irfan Khan

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Cyber Attacks ◽

Detection Systems ◽

Cyber Threats ◽

The Past ◽

Explainable AI ◽

Series Of Experiments

Emerging cyber threats, together with an increased dependency on vulnerable cyber-networks, have jeopardized all stakeholders, making Intrusion Detection Systems (IDS) an essential network security requirement. Several IDS have been proposed in the past decade for protecting systems from cyber-attacks. Machine learning (ML) based IDS have shown remarkable performance on conventional cyber threats. However, the introduction of adversarial attacks in the cyber domain highlights the need to upgrade these IDS, because conventional ML-based approaches are vulnerable to adversarial attacks. Therefore, the proposed IDS framework leverages the performance of conventional ML-based IDS and integrates it with Explainable AI (XAI) to deal with adversarial attacks. A global explanation of the AI model, extracted by SHAP (Shapley additive explanation) during the training phase of the primary Random Forest Classifier (RFC), is used to reassess the credibility of predicted outcomes. In other words, an outcome with low credibility is reassessed by secondary classifiers. This SHAP-based approach helps in filtering out all disguised malicious network traffic and can also enhance user trust by adding transparency to the decision-making process. Adversarial robustness of the proposed IDS was assessed using the Hop Skip Jump Attack and the CICIDS dataset, where the IDS showed 98.5% and 100% accuracy, respectively. Furthermore, the performance of the proposed IDS is compared with conventional algorithms using recall, precision, accuracy, and F1-score as evaluation metrics. This comparative analysis and series of experiments endorse the credibility of the proposed scheme, showing that the integration of XAI with conventional IDS can ensure credibility, integrity, and availability of cyber-networks.
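The reassessment idea in this abstract, where a low-credibility outcome from the primary classifier is handed to a secondary classifier, can be sketched as a gated cascade. The sketch is hypothetical throughout: the paper derives credibility from SHAP explanations, whereas here a simple distance-from-boundary score stands in for it, and the classifiers, field names and threshold are made up.

```python
def primary(record):
    """Hypothetical primary classifier: returns (label, credibility in [0, 1])."""
    score = min(record["src_bytes"] / 10_000, 1.0)  # toy attack score
    label = "attack" if score >= 0.5 else "normal"
    credibility = abs(score - 0.5) * 2              # 0 at the decision boundary, 1 far from it
    return label, credibility

def secondary(record):
    """Hypothetical secondary classifier, consulted only for doubtful cases."""
    return "attack" if record["failed_logins"] > 3 else "normal"

def classify(record, threshold=0.6):
    """Return (label, which classifier decided)."""
    label, credibility = primary(record)
    if credibility < threshold:
        return secondary(record), "secondary"
    return label, "primary"

print(classify({"src_bytes": 9_500, "failed_logins": 0}))  # decided by the primary
print(classify({"src_bytes": 5_200, "failed_logins": 7}))  # near the boundary, reassessed
```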

TR-IDS: Anomaly-Based Intrusion Detection through Text-Convolutional Neural Network and Random Forest

Security and Communication Networks ◽

10.1155/2018/4943509 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9 ◽

Cited By ~ 22

Author(s):

Erxue Min ◽

Jun Long ◽

Qiang Liu ◽

Jianjing Cui ◽

Wei Chen

Keyword(s):

Neural Network ◽

Random Forest ◽

Intrusion Detection ◽

Convolutional Neural Network ◽

Detection System ◽

Statistical Features ◽

Detection Systems ◽

Network Intrusion ◽

Network Intrusion Detection Systems ◽

Packet Header

As we head towards the IoT (Internet of Things) era, protecting network infrastructures and information security has become increasingly crucial. In recent years, Anomaly-Based Network Intrusion Detection Systems (ANIDSs) have gained extensive attention for their capability of detecting novel attacks. However, most ANIDSs focus on packet header information and omit the valuable information in payloads, despite the fact that payload-based attacks have become ubiquitous. In this paper, we propose a novel intrusion detection system named TR-IDS, which takes advantage of both statistical features and payload features. Word embedding and text-convolutional neural network (Text-CNN) are applied to extract effective information from payloads. After that, the sophisticated random forest algorithm is performed on the combination of statistical features and payload features. Extensive experimental evaluations demonstrate the effectiveness of the proposed methods.

Vacant Parking Lot Detection System Using Random Forest Classification

2019 3rd International Conference on Computing Methodologies and Communication (ICCMC) ◽

10.1109/iccmc.2019.8819689 ◽

2019 ◽

Cited By ~ 4

Author(s):

Suthapalli Uday Raj ◽

Mummidi Veera Manikanta ◽

Paduchuri Sesha Sai Harsitha ◽

M. Judith Leo

Keyword(s):

Random Forest ◽

Detection System ◽

Random Forest Classification ◽

Forest Classification ◽

Parking Lot

Intrusion Detection Based on Approximate Information Entropy for Random Forest Classification

Proceedings of the 2019 4th International Conference on Big Data and Computing - ICBDC 2019 ◽

10.1145/3335484.3335488 ◽

2019 ◽

Author(s):

Le Yang ◽

Manchun Cai ◽

Yongcheng Duan ◽

Xue Yang

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Information Entropy ◽

Random Forest Classification ◽

Forest Classification

Improving DDoS Attack Predection Performance using Ensambling Techniqes

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c6860.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 4760-4763

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Feature Subset Selection ◽

Intrusion Detection Systems ◽

Support Vector ◽

Feature Subset ◽

Detection Systems ◽

DDoS Attack

This paper proposes utilizing support vector machine (SVM), neural network and decision tree C5 algorithms for predicting undesirable data. Intrusion detection systems exist to counter DoS attacks, but their performance has to be maintained. We therefore propose a novel model for an intrusion detection system on a cloud platform that uses a random forest classifier and an XGBoost model. Random Forest (RF) is an ensemble classifier and performs well compared to other conventional classifiers for effective classification of attacks. The intrusion detection system is made fast and effective by optimal feature subset selection using IG. In this paper, we demonstrate DDoS anomaly detection on the open Cloud DDoS attack datasets using Random Forest and Gradient Boosting (GB) machine learning (ML) models.
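Assuming "IG" in this abstract stands for information gain, ranking features by that score can be sketched as follows. The toy records and feature names (`protocol`, `flag`) are made up, and the sketch handles discrete features only; it is an illustration of the measure, not the authors' selection pipeline.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the entropy remaining after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Made-up toy records.
labels = ["attack", "attack", "normal", "normal"]
protocol = ["udp", "udp", "tcp", "tcp"]  # separates the two classes perfectly
flag = ["S0", "SF", "S0", "SF"]          # carries no information about the class

scores = {
    "protocol": information_gain(protocol, labels),
    "flag": information_gain(flag, labels),
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

Keeping the top-ranked features and discarding the rest gives a smaller input for the classifier, which is the speed-up the abstract refers to.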

An Effective Intrusion Detection Model Based on Random Forest and Neural Networks

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.267.308 ◽

2011 ◽

Vol 267 ◽

pp. 308-313 ◽

Cited By ~ 3

Author(s):

Shao Hong Zhong ◽

Hua Jun Huang ◽

Ai Bin Chen

Keyword(s):

Neural Networks ◽

Random Forest ◽

Intrusion Detection ◽

Detection System ◽

White Paper ◽

Computationally Efficient ◽

Detection Model ◽

Detection Systems ◽

Network Intrusion ◽

Feature Selection Approach

Intrusion detection is a very important research domain in network security. Current intrusion detection systems (IDS), especially network intrusion detection systems (NIDS), examine all data features to detect intrusions. Many machine learning and data mining methods are also used for intrusion detection tasks. This paper proposes an intrusion detection model that is both effective and computationally efficient, based on a Random Forest feature selection approach and a Neural Network (NN) model. First, the random forest method is used to select the most important features; eliminating the insignificant or useless inputs simplifies the problem and allows faster and more accurate detection. Second, a classic NN model is used to learn and detect intrusions using the selected important features. Experimental results on the well-known KDD 1999 dataset demonstrate that the proposed hybrid model is effective.