Intrusion Detection System Combined Enhanced Random Forest With SMOTE Algorithm

Abstract: Networks face malicious attacks from multiple sources, and intrusion detection systems (IDS) play a key role in maintaining network security. Intrusion detection models often show relatively high false detection rates because class imbalance leaves too few training samples for the minority attack classes. To address this sample imbalance problem, this paper proposes a network intrusion detection algorithm based on an enhanced random forest and the Synthetic Minority Over-Sampling Technique (SMOTE). First, a hybrid algorithm combining K-means clustering with SMOTE sampling increases the number of minority samples to produce a balanced data set, so that the features of the minority samples can be learned more effectively. Second, the enhanced random forest produces a preliminary prediction, and a similarity matrix of network attacks, built from an analysis of the attack types, is then used to correct the voting results. The method was evaluated on the NSL-KDD dataset and achieved a classification accuracy of 99.72% on the training set and 78.47% on the test set. Compared with related work, the method shows some improvement in classification accuracy.
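The oversampling step described in the abstract above can be illustrated with a minimal sketch of plain SMOTE interpolation. This is not the authors' hybrid method: the K-means clustering stage is omitted, and the function name `smote`, the parameters `k` and `seed`, and the toy `minority` samples are all assumptions made for illustration.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class feature vectors (not taken from NSL-KDD).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.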


The Application of Fuzzy Clustering Number Algorithm in Network Intrusion Detection

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.760-762.2220 ◽ 2013 ◽ Vol 760-762 ◽ pp. 2220-2223

Author(s): Lang Guo

Keyword(s): Intrusion Detection ◽ Fuzzy Clustering ◽ Clustering Algorithm ◽ Local Optimum ◽ Cluster Number ◽ Data Set ◽ Network Intrusion ◽ Correlation Degree ◽ Indicator Data ◽ Detection Effect

The K-means algorithm has several defects in intrusion detection: the number of clusters must be assigned in advance, the result is sensitive to the initial centers, and the algorithm easily falls into a local optimum. To address these defects, this paper puts forward a fuzzy clustering algorithm. Fuzzy rules are used to express the intrusion features, and a standardized matrix is then processed further to reflect the degree of approximation or correlation between the intrusion indicator data and to establish a similarity matrix. Simulation results on the KDD CUP 1999 data set show that the algorithm achieves a better intrusion detection effect and can effectively detect network intrusion data.
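The step from a standardized matrix to a similarity matrix can be sketched as follows. The abstract does not say which standardization or similarity measure the paper uses, so min-max scaling and cosine similarity are assumptions here, as are the function names and the toy `records`.

```python
import math

def min_max_standardize(rows):
    """Scale every column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard constant columns
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]

def cosine_similarity_matrix(rows):
    """Entry [i][j] is the cosine of the angle between records i and j."""
    norms = [math.sqrt(sum(v * v for v in row)) or 1.0 for row in rows]
    return [[sum(a * b for a, b in zip(r1, r2)) / (n1 * n2)
             for r2, n2 in zip(rows, norms)]
            for r1, n1 in zip(rows, norms)]

# Hypothetical indicator data: two similar pairs of connection records.
records = [[2, 10, 0.1], [3, 12, 0.2], [40, 1, 0.9], [38, 2, 0.8]]
S = cosine_similarity_matrix(min_max_standardize(records))
```

In this toy data, records 0 and 1 end up highly similar to each other, as do records 2 and 3, while similarities across the two pairs are low. A fuzzy clustering procedure would then group records by thresholding or further transforming this matrix.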


Deep Stacking Network for Intrusion Detection

Sensors ◽ 10.3390/s22010025 ◽ 2021 ◽ Vol 22 (1) ◽ pp. 25

Author(s): Yifan Tang ◽ Lize Gu ◽ Leiting Wang

Keyword(s): Intrusion Detection ◽ Decision Tree ◽ Classification Accuracy ◽ Intrusion Detection System ◽ Detection System ◽ Detection Performance ◽ Network Intrusion Detection ◽ Fusion Model ◽ Data Set ◽ Network Intrusion

Preventing network intrusion is an essential requirement of network security. In recent years, a great deal of research has been conducted on network intrusion detection systems. However, with the increasing number of advanced threat attacks, traditional intrusion detection mechanisms show defects, and designing a powerful intrusion detection system remains necessary. This paper studies the NSL-KDD data set and analyzes the latest developments and existing problems in intrusion detection technology. To handle the unbalanced distribution and feature redundancy of the training data, under-sampling and feature selection are applied to the training samples. To improve the detection effect, a Deep Stacking Network model is proposed, which combines the classification results of multiple basic classifiers to improve classification accuracy. In the experiments, the performance of various mainstream classifiers was screened and compared; four models (decision tree, k-nearest neighbors, deep neural network and random forest) showed outstanding detection performance and meet the needs of different classification effects. Among them, the decision tree reaches a classification accuracy of 86.1%. The Deep Stacking Network, a fusion model composed of the four classifiers, improves the classification effect further and reaches an accuracy of 86.8%. Compared with the intrusion detection systems of other research papers, the proposed model effectively improves detection performance and represents a significant improvement in network intrusion detection.
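The idea of combining the outputs of several base classifiers can be illustrated with a very small stacking sketch. This is not the paper's Deep Stacking Network: the meta-learner here is an assumed lookup table that remembers, for each combination of base predictions on held-out data, the most common true label. The function names and the toy predictions are hypothetical.

```python
from collections import Counter, defaultdict

def fit_meta(base_preds, labels):
    """Meta-learner: map each tuple of base predictions to the true label
    most often seen with it on held-out data."""
    table = defaultdict(Counter)
    for preds, y in zip(base_preds, labels):
        table[tuple(preds)][y] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in table.items()}

def predict_meta(meta, preds):
    """Use the learned table; fall back to a majority vote for unseen patterns."""
    key = tuple(preds)
    if key in meta:
        return meta[key]
    return Counter(preds).most_common(1)[0][0]

# Hypothetical held-out outputs of three base classifiers, with true labels.
held_out_preds = [
    ("normal", "normal", "attack"),
    ("attack", "attack", "attack"),
    ("normal", "attack", "attack"),
    ("normal", "normal", "attack"),
    ("normal", "normal", "normal"),
]
held_out_labels = ["attack", "attack", "attack", "attack", "normal"]

meta = fit_meta(held_out_preds, held_out_labels)
seen = predict_meta(meta, ("normal", "normal", "attack"))
unseen = predict_meta(meta, ("attack", "normal", "normal"))
```

In the toy data the third classifier is reliable whenever it reports an attack, so the meta-learner overrides a simple majority vote for the pattern `("normal", "normal", "attack")`. That is the kind of gain a fusion model can offer over any single base classifier.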


Network intrusion detection using oversampling technique and machine learning algorithms

PeerJ Computer Science ◽ 10.7717/peerj-cs.820 ◽ 2022 ◽ Vol 8 ◽ pp. e820

Author(s): Hafiza Anisa Ahmed ◽ Anum Hameed ◽ Narmeen Zakaria Bawany

Keyword(s): Machine Learning ◽ Random Forest ◽ Network Security ◽ Intrusion Detection ◽ Network Traffic ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Network Intrusion Detection ◽ Classification Models ◽ Network Intrusion

The rapid growth of the World Wide Web and the rampant flow of network traffic have resulted in a continuous increase in network security threats. Cyber attackers seek to exploit vulnerabilities in network architecture to steal valuable information or disrupt computer resources. A Network Intrusion Detection System (NIDS) is used to detect various attacks and thus protect network resources from them in a timely manner. To implement a NIDS, a range of supervised and unsupervised machine learning approaches is applied to detect irregularities in network traffic and to address network security issues. Such NIDSs are trained using various datasets that include attack traces. However, because modern attacks keep advancing, these systems are unable to detect emerging threats. Therefore, a NIDS needs to be trained and developed with a modern, comprehensive dataset that contains contemporary normal and attack activities. This paper presents a framework in which different machine learning classification schemes are employed to detect various network attack categories. Five machine learning algorithms are used for attack detection: Random Forest, Decision Tree, Logistic Regression, K-Nearest Neighbors and Artificial Neural Networks. This study uses a dataset published by the University of New South Wales (UNSW-NB15), a relatively new dataset that contains a large amount of network traffic data with nine categories of network attacks. The results show that the highest accuracy among the classification models, 89.29%, was achieved by the Random Forest algorithm. The accuracy of the classification models improves further when the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem. After applying SMOTE, the Random Forest classifier showed an accuracy of 95.1% with 24 features selected by the Principal Component Analysis method.


Semi-Supervised Dynamic Classification for Intrusion Detection

International Journal of Software Engineering and Knowledge Engineering ◽ 10.1142/s0218194010004669 ◽ 2010 ◽ Vol 20 (02) ◽ pp. 139-154 ◽ Cited By ~ 2

Author(s): Negar Koochakzadeh ◽ Keivan Kianmehr ◽ Jamal Jida ◽ Iltae Lee ◽ Reda Alhajj ◽ ...

Keyword(s): Intrusion Detection ◽ Test Data ◽ Clustering Algorithm ◽ Spatial Clustering ◽ Training Data ◽ Support Vector ◽ Data Set ◽ Real World Application ◽ Application Data ◽ New Framework

In this paper, a new framework for building an adaptive classifier is introduced. First, a clustering algorithm, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), is applied to a set of sample data to form an initial set of clusters. The clusters are treated as classes, and a classifier model is generated using a support vector machine (SVM). In real-world applications, data arrives continuously. If the model does not learn from the new data, it may not perform well on it, especially when the training data differs from the test data. The framework proposed in this paper rebuilds the classifier model using selected data from the test data set to improve the accuracy of the model. A case study on an intrusion detection data set was performed to evaluate the methodology. The results show that this approach leads to more accurate classification models over time.
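The first stage of the framework, forming initial clusters with DBSCAN, can be sketched in a few lines. This is a textbook version of the algorithm written for readability rather than speed. The parameter values and the toy `points` are assumptions, and the later SVM training and model-rebuilding stages are not shown.

```python
def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    def neighbours(i):
        # Indices of all points within distance eps of point i (including i).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels

# Two dense groups and one isolated point (hypothetical 2-D features).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The resulting cluster labels would serve as class labels for training the classifier, with noise points (label -1) either dropped or handled separately.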


Improving Intrusion Detection Model Prediction by Threshold Adaptation

Information ◽ 10.3390/info10050159 ◽ 2019 ◽ Vol 10 (5) ◽ pp. 159 ◽ Cited By ~ 5

Author(s): Al Tobi ◽ Duncan

Keyword(s): Random Forest ◽ Intrusion Detection ◽ Machine Learning Algorithms ◽ Support Vector ◽ Detection Rates ◽ Detection Model ◽ Network Intrusion ◽ Prospective Sampling ◽ Improving Accuracy ◽ High Level

Network traffic exhibits a high level of variability over short periods of time. This variability reduces the accuracy of anomaly-based network intrusion detection systems (IDS) that are built using predictive models in a batch learning setup. This work investigates how adapting the discriminating threshold of model predictions to the evaluated traffic improves the detection rates of these intrusion detection models. Specifically, this research studied the adaptability of three well-known machine learning algorithms: C5.0, Random Forest and Support Vector Machine. Each algorithm's ability to adapt its prediction threshold was assessed and analysed under different scenarios that simulated real-world settings using the prospective sampling approach. Multiple IDS datasets were used for the analysis, including a newly generated dataset (STA2018). This research demonstrated empirically the importance of threshold adaptation in improving the accuracy of detection models when training and evaluation traffic have different statistical properties. Tests were undertaken to analyse the effects of feature selection and data balancing on model accuracy when different significant features in traffic were used. The effects of threshold adaptation on improving accuracy were statistically analysed. Of the three compared algorithms, Random Forest was the most adaptable and had the highest detection rates.
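Threshold adaptation can be illustrated with a small sketch that sweeps candidate cut-offs over model scores and keeps the one with the best F1 on a labelled sample of the evaluated traffic. The abstract does not state the selection criterion, so the use of F1, the function name `best_threshold`, and the toy scores are assumptions.

```python
def best_threshold(scores, labels):
    """Pick the score cut-off that maximises F1 on the given traffic sample.
    A record is flagged as an attack when its score is >= the cut-off."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Hypothetical attack scores from a trained model; 1 = attack, 0 = normal.
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.80]
labels = [0, 0, 0, 1, 1, 0, 1]
threshold, f1 = best_threshold(scores, labels)
```

With the conventional cut-off of 0.5 only one of the three attacks in this sample would be flagged. Lowering the threshold to 0.35 catches all three at the cost of one false alarm, which is the kind of gain the abstract attributes to adapting the threshold when evaluation traffic differs from training traffic.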


Intrusion Detection for Network Based on Elite Clone Artificial Bee Colony and Back Propagation Neural Network

Wireless Communications and Mobile Computing ◽ 10.1155/2021/9956371 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): Guohong Qi ◽ Jie Zhou ◽ Wenxian Jia ◽ Menghan Liu ◽ Shengnan Zhang ◽ ...

Keyword(s): Network Security ◽ Intrusion Detection ◽ Artificial Bee Colony ◽ Back Propagation ◽ Back Propagation Neural Network ◽ Network Intrusion Detection ◽ Water Pipe ◽ Network Attacks ◽ Bee Colony ◽ Network Intrusion

With the rapid development of Internet technology, network attacks have become more frequent and complex, and intrusion detection plays an increasingly important role in network security. Intrusion detection is real-time and proactive, and it is an indispensable technology given the increasingly diverse range of network security issues. Neural networks offer self-learning, self-adaptation, and parallel computing, which are very important in intrusion detection. This paper combines a back propagation neural network (BPNN) with the elite clone artificial bee colony (ECABC) algorithm to propose a new ECABC-BPNN, which updates and optimizes the settings of the traditional BPNN weights and thresholds. ECABC-BPNN is then applied to network intrusion detection. Attack classification experiments are carried out with GA-BPNN, PSO-BPNN, and ECABC-BPNN on attack data samples from KDD CUP 99 and water pipe data. The results show that the proposed ECABC-BPNN has an accuracy of 98.08% on KDD 99 and 99.76% on the water pipe data. ECABC-BPNN effectively improves the accuracy of network intrusion classification and reduces classification errors. In addition, the time complexity of using ECABC-BPNN to classify network attacks is relatively low. Therefore, ECABC-BPNN has superior performance in network intrusion detection and classification.


Implementing a network intrusion detection system using semi-supervised support vector machine and random forest

Proceedings of the 2021 ACM Southeast Conference ◽ 10.1145/3409334.3452073 ◽ 2021

Author(s): Sandeep Shah ◽ Pramita Sree Muhuri ◽ Xiaohong Yuan ◽ Kaushik Roy ◽ Prosenjit Chatterjee

Keyword(s): Support Vector Machine ◽ Random Forest ◽ Intrusion Detection ◽ Intrusion Detection System ◽ Detection System ◽ Network Intrusion Detection ◽ Support Vector ◽ Network Intrusion ◽ Network Intrusion Detection System


Intrusion Detection Based on Big Data Fuzzy Analytics

10.5772/intechopen.99636 ◽ 2021

Author(s): Farah Jemili ◽ Hajer Bouras

Keyword(s): Big Data ◽ Network Security ◽ Intrusion Detection ◽ Intrusion Detection System ◽ High Performance ◽ Detection System ◽ Training Dataset ◽ False Alarms ◽ Massive Datasets ◽ Detection Rates

Today, the Intrusion Detection System (IDS) is one of the significant tools used to improve network security by detecting attacks or abnormal data accesses. Most existing IDSs have disadvantages such as high false alarm rates and low detection rates. For an IDS, dealing with distributed and massive data is a challenge, and dealing with imprecise data is another. This paper proposes an Intrusion Detection System based on big data fuzzy analytics, in which the Fuzzy C-Means (FCM) method is used to cluster and classify the pre-processed training dataset. The CTU-13 and UNSW-NB15 datasets are used as distributed and massive datasets to demonstrate the feasibility of the method. The proposed system shows high performance in terms of accuracy, precision, detection rates, and false alarms.
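The Fuzzy C-Means method mentioned in the abstract alternates two updates: each point receives a degree of membership in every cluster based on relative distances, and each centre moves to the membership-weighted mean of the points. A minimal single-machine sketch follows. The distributed big data setting of the paper is not reproduced, and the fuzzifier `m`, the iteration count, the initial centres and the toy data are assumptions.

```python
def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate the standard FCM membership and centre updates.
    The returned memberships are those computed in the final iteration."""
    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Membership of each point in each cluster, from relative distances.
        memberships = []
        for p in points:
            d = [max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                 for c in centers]
            memberships.append([1.0 / sum((di / dk) ** power for dk in d)
                                for di in d])
        # Each centre becomes the membership-weighted mean of all points.
        centers = []
        for i in range(len(memberships[0])):
            w = [u[i] ** m for u in memberships]
            centers.append([sum(wj * p[k] for wj, p in zip(w, points)) / sum(w)
                            for k in range(len(points[0]))])
    return centers, memberships

# Hypothetical 1-D feature with two well-separated groups.
points = [[1.0], [1.2], [0.9], [8.0], [8.3], [7.9]]
centers, memberships = fuzzy_c_means(points, centers=[[0.0], [10.0]])
```

Unlike K-means, every point keeps a fractional membership in each cluster, which is what lets the approach represent imprecise data: a record near a cluster boundary is not forced into a single hard label.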


The Effectiveness of Sampling Methods for the Imbalanced Network Intrusion Detection Data Set

Advances in Intelligent Systems and Computing - Recent Advances on Soft Computing and Data Mining ◽ 10.1007/978-3-319-07692-8_58 ◽ 2014 ◽ pp. 613-622 ◽ Cited By ~ 3

Author(s): Kok-Chin Khor ◽ Choo-Yee Ting ◽ Somnuk Phon-Amnuaisuk

Keyword(s): Intrusion Detection ◽ Sampling Methods ◽ Network Intrusion Detection ◽ Data Set ◽ Network Intrusion


Enhance Network Intrusion Detection System by Exploiting BR Algorithm as an Optimal Feature Selection

Handbook of Research on Threat Detection and Countermeasures in Network Security - Advances in Information Security, Privacy, and Ethics ◽ 10.4018/978-1-4666-6583-5.ch002 ◽ 2015 ◽ pp. 17-32 ◽ Cited By ~ 1

Author(s): Soukaena Hassan Hashem

Keyword(s): Intrusion Detection ◽ Wireless Network ◽ Intrusion Detection System ◽ Missing Values ◽ Detection System ◽ Network Intrusion Detection ◽ Wireless Data ◽ Data Set ◽ Network Intrusion ◽ Network Intrusion Detection System

This chapter proposes a Wire/Wireless Network Intrusion Detection System (WWNIDS) that detects intrusions and considers many modern attacks that were not previously taken into account. The proposed WWNIDS performs intrusion detection using only intrinsic features, and not all of them. The WWNIDS dataset consists of two parts. The first part is a wired network dataset constructed from KDD'99, which has 41 features, with some modifications, including three additional suggested features, to make it reliable in detecting intrusions; the resulting dataset is called modern KDD. The second part is a wireless network dataset built by collecting thousands of sessions (normal and intrusion); this dataset is called the Constructed Wireless Data Set (CWDS). Both datasets (KDD and CWDS) are preprocessed to eliminate problems that affect intrusion detection, such as noise, missing values and duplication.
