Intrusion Detection Model for Imbalanced Dataset Using SMOTE and Random Forest Algorithm

2021 ◽  
pp. 361-378
Author(s):  
Reem Alshamy ◽  
Mossa Ghurab ◽  
Suad Othman ◽  
Faisal Alshami
2016 ◽  
Vol 15 (3) ◽  
pp. 6563-6569
Author(s):  
S.J.SATHISH AARON JOSEPH ◽  
R. BALASUBRAMANIAN

Intrusion detection is one of the major necessities of the current networked environment, where every information is available in its corresponding digital form. This paper presents an enhanced tree based approach that can be used to perform intrusion detection faster and with better accuracy. The training data is subject to the random forest algorithm. This algorithm is a combination of tree predictors, and each tree depends upon the random vector generated. Spark based implementations of the Random Forest algorithm is used in a Hadoop cluster on datasets with varied imbalance to obtain the results. It has been observed that the classifier provided results in real time with an accuracy >90%, hence is more appropriate for online intrusion detection.


Information ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 159 ◽  
Author(s):  
Al Tobi ◽  
Duncan

Network traffic exhibits a high level of variability over short periods of time. This variability impacts negatively on the accuracy of anomaly-based network intrusion detection systems (IDS) that are built using predictive models in a batch learning setup. This work investigates how adapting the discriminating threshold of model predictions, specifically to the evaluated traffic, improves the detection rates of these intrusion detection models. Specifically, this research studied the adaptability features of three well known machine learning algorithms: C5.0, Random Forest and Support Vector Machine. Each algorithm’s ability to adapt their prediction thresholds was assessed and analysed under different scenarios that simulated real world settings using the prospective sampling approach. Multiple IDS datasets were used for the analysis, including a newly generated dataset (STA2018). This research demonstrated empirically the importance of threshold adaptation in improving the accuracy of detection models when training and evaluation traffic have different statistical properties. Tests were undertaken to analyse the effects of feature selection and data balancing on model accuracy when different significant features in traffic were used. The effects of threshold adaptation on improving accuracy were statistically analysed. Of the three compared algorithms, Random Forest was the most adaptable and had the highest detection rates.


2019 ◽  
Vol 8 (4) ◽  
pp. 5054-5058

Malicious threats are better known by their work of damages. This damages are not just limited to the system, but it might lead to significant information damage too. Along with this, threats are also responsible for financial loss. As technology increases, Types and attacks of threats also increases. Though the research community investigated a number of cyber attack prevention models it is challenging to detect the threat and preventing them from data, for the industries. Detection of the attacks with IDS is common and popular in organizations . Now a days data mining and hybrid approaches are getting priority combine with IDS in the area of anomalies and attack detection. In this paper, we focus on the designing a tool based on signature approach and the random forest algorithm for intrusion detection that offers data security and protection. Both algorithm works individually for IDS system but signature base algorithm have some limitations of known database requirement. In our research paper, we proposed a Hybrid intrusion detection model which allows us to double filtration of the intrusions in the application with implementation of combine signature and behavior based algorithm in one system. This paper addresses the various kinds of feature and the behavior of the threat and their different functioning further intrusion detection hybrid model is the extension for the simple individual model who work on either behavior or on signature.


2011 ◽  
Vol 267 ◽  
pp. 308-313 ◽  
Author(s):  
Shao Hong Zhong ◽  
Hua Jun Huang ◽  
Ai Bin Chen

This document explains and demonstrates how to prepare your camera-ready manuscript for Trans Tech Publications. The best is to read these instructions and follow the outline of this text. The text area for your manuscript must be 17 cm wide and 25 cm high (6.7 and 9.8 inches, resp.). Do not place any text outside this area. Use good quality, white paper of approximately 21 x 29 cm or 8 x 11 inches (please do not change the document setting from A4 to letter). Your manuscript will be reduced by approximately 20% by the publisher. Please keep this in mind when designing your figures and tables etc.Intrusion detection is a very important research domain in network security. Current intrusion detection systems (IDS) especially NIDS (Network Intrusion Detection System) examine all data features to detect intrusions. Also, many machine learning and data mining methods are utilized to fulfill intrusion detection tasks. This paper proposes an effective intrusion detection model that is computationally efficient and effective based on Random Forest based feature selection approach and Neural Networks (NN) model. We firstly utilize random forest method to select the most important features to eliminate the insignificant and/or useless inputs leads to a simplification of the problem, in order to faster and more accurate detection; Secondly, classic NN model is used to learn and detect intrusions using the selected important features. Experimental results on the well-known KDD 1999 dataset demonstrate the proposed hybrid model is actually effective.


Sign in / Sign up

Export Citation Format

Share Document