Analysis of Machine Learning Algorithms with Feature Selection for Intrusion Detection using UNSW-NB15 Dataset

Geeta Kocher; Gulshan Kumar

doi:10.5121/ijnsa.2021.13102

International Journal of Network Security & Its Applications ◽

2021 ◽

Vol 13 (1) ◽

pp. 21-31

Author(s):

Geeta Kocher ◽

Gulshan Kumar

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Intrusion Detection ◽

Stochastic Gradient Descent ◽

Feature Selection Technique ◽

Selection Technique ◽

Machine Learning Classifiers ◽

Network Intrusion ◽

Learning Classifiers ◽

Positive Rate

Machine learning classifiers are now widely used to improve network intrusion detection, and researchers have proposed many such solutions in the literature. However, these classifiers are often trained on older datasets, which limits their detection accuracy, so there is a need to train them on the latest dataset. In this paper, the latest dataset, UNSW-NB15, is used to train machine learning classifiers. K-Nearest Neighbors (KNN), Stochastic Gradient Descent (SGD), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB) classifiers are selected for training from a taxonomy of classifiers based on lazy and eager learners. Chi-Square, a filter-based feature selection technique, is applied to the UNSW-NB15 dataset to remove irrelevant and redundant features. The performance of the classifiers is measured in terms of Accuracy, Mean Squared Error (MSE), Precision, Recall, F1-Score, True Positive Rate (TPR), and False Positive Rate (FPR), both with and without feature selection, and a comparative analysis of the classifiers is carried out.
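The Chi-Square filter step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes categorical features and a binary label, scores each feature with the Pearson chi-square statistic against the class, and keeps the top k. The function names `chi_square_score` and `select_top_k`, the toy columns, and the value of k are all assumptions made for the example.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the class label."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))   # observed (value, class) counts
    value_totals = Counter(feature_values)         # row totals
    label_totals = Counter(labels)                 # column totals
    score = 0.0
    for value, value_count in value_totals.items():
        for label, label_count in label_totals.items():
            expected = value_count * label_count / n   # count expected under independence
            observed = joint.get((value, label), 0)
            score += (observed - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Rank features by chi-square score and keep the k highest (hypothetical helper)."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data (assumed, not taken from UNSW-NB15): 1 = attack, 0 = normal.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "proto": ["tcp", "tcp", "tcp", "tcp", "udp", "udp", "udp", "udp"],  # tracks the label
    "state": ["FIN", "CON", "FIN", "CON", "FIN", "CON", "FIN", "CON"],  # independent of it
}

selected, scores = select_top_k(columns, labels, k=1)
print(selected, scores)
```

A feature that is independent of the class scores near zero and is dropped first, which is how a filter method discards irrelevant features before any classifier is trained.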

Performance Analysis of Machine Learning Classifiers for Intrusion Detection using UNSW-NB15 Dataset

10.5121/csit.2020.102004 ◽

2020 ◽

Author(s):

Geeta Kocher ◽

Gulshan Kumar

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Mean Squared Error ◽

Internet Technology ◽

Stochastic Gradient Descent ◽

Detection Accuracy ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Positive Rate ◽

The Impact

With the advancement of internet technology, the number of threats is also rising exponentially. To reduce the impact of these threats, researchers have proposed many solutions for intrusion detection. In the literature, various machine learning classifiers are trained on older datasets for intrusion detection, which limits their detection accuracy, so there is a need to train the classifiers on the latest dataset. In this paper, the latest dataset, UNSW-NB15, is used to train machine learning classifiers. On the basis of theoretical analysis, a taxonomy is proposed in terms of lazy and eager learners. From this taxonomy, K-Nearest Neighbors (KNN), Stochastic Gradient Descent (SGD), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB) classifiers are selected for training. The performance of these classifiers is tested on the UNSW-NB15 dataset in terms of Accuracy, Mean Squared Error (MSE), Precision, Recall, F1-Score, True Positive Rate (TPR), and False Positive Rate (FPR), and a comparative analysis of the classifiers is carried out. The experimental results show that the RF classifier outperforms the other classifiers.
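The evaluation measures listed in this abstract can all be derived from the four confusion-matrix counts of a binary classifier. The sketch below is illustrative only: the helper name `binary_metrics` and the toy labels are hypothetical, and 1 is assumed to mark an attack and 0 normal traffic. TPR is the same quantity as Recall, and for 0/1 labels MSE reduces to the misclassification rate.

```python
def binary_metrics(y_true, y_pred):
    """Compute common binary-classification measures from 0/1 labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks
    n = len(pairs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # identical to TPR
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {
        "accuracy": (tp + tn) / n,
        "mse": sum((t - p) ** 2 for t, p in pairs) / n,
        "precision": precision,
        "recall": recall,
        "tpr": recall,
        "fpr": fpr,
        "f1": f1,
    }


# Toy predictions (assumed for illustration): one missed attack, one false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```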

Machine Learning Classifiers for Network Intrusion Detection System: Comparative Study

2021 International Conference on Information Technology (ICIT) ◽

10.1109/icit52682.2021.9491770 ◽

2021 ◽

Author(s):

Omar Almomani ◽

Mohammed Amin Almaiah ◽

Adeeb Alsaaidah ◽

Sami Smadi ◽

Adel Hamdan Mohammad ◽

...

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Comparative Study ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Machine Learning Classifiers ◽

Network Intrusion ◽

Network Intrusion Detection System ◽

Learning Classifiers

Gradient Boosting Feature Selection with Machine Learning Classifiers for Intrusion Detection on Power Grids

IEEE Transactions on Network and Service Management ◽

10.1109/tnsm.2020.3032618 ◽

2020 ◽

pp. 1-1

Author(s):

Darshana Upadhyay ◽

Jaume Manero ◽

Marzia Zaman ◽

Srinivas Sampalli

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Intrusion Detection ◽

Power Grids ◽

Gradient Boosting ◽

Machine Learning Classifiers ◽

Learning Classifiers

CALIBRATION OF VARIOUS OPTIMIZED MACHINE LEARNING CLASSIFIERS IN NETWORK INTRUSION DETECTION SYSTEM ON THE REALISTIC CYBER DATASET CSE-CIC-IDS2018 USING CLOUD COMPUTING

International Journal of Engineering Applied Sciences and Technology ◽

10.33564/ijeast.2019.v04i06.036 ◽

2019 ◽

Vol 04 (06) ◽

pp. 209-213

Author(s):

V Kanimozhi ◽

Dr. T. Prem Jacob

Keyword(s):

Machine Learning ◽

Cloud Computing ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Machine Learning Classifiers ◽

Network Intrusion ◽

Network Intrusion Detection System ◽

Learning Classifiers

Multi-class SVM based network intrusion detection with attribute selection using infinite feature selection technique

Journal of Discrete Mathematical Sciences and Cryptography ◽

10.1080/09720529.2021.2009189 ◽

2021 ◽

Vol 24 (8) ◽

pp. 2137-2153

Author(s):

Ruchi Kaushik ◽

Vijander Singh ◽

Rajani Kumar

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Attribute Selection ◽

Network Intrusion Detection ◽

Feature Selection Technique ◽

Selection Technique ◽

Network Intrusion

Optimization of Network Intrusion Detection System Using Genetic Algorithm with Improved Feature Selection Technique

2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM) ◽

10.1109/hnicem48295.2019.9073439 ◽

2019 ◽

Cited By ~ 1

Author(s):

Elmer C. Matel ◽

Ariel M. Sison ◽

Ruji P. Medina

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Feature Selection Technique ◽

Selection Technique ◽

Network Intrusion ◽

Network Intrusion Detection System

A Fusion of Feature Extraction and Feature Selection Technique for Network Intrusion Detection

International Journal of Security and Its Applications ◽

10.14257/ijsia.2016.10.8.13 ◽

2016 ◽

Vol 10 (8) ◽

pp. 151-158 ◽

Cited By ~ 3

Author(s):

Yasir Hamid ◽

M. Sugumaran ◽

Ludovic Journaux

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Intrusion Detection ◽

Network Intrusion Detection ◽

Feature Selection Technique ◽

Selection Technique ◽

Network Intrusion

Performance analysis of machine learning models for intrusion detection system using Gini Impurity-based Weighted Random Forest (GIWRF) feature selection technique

Cybersecurity ◽

10.1186/s42400-021-00103-8 ◽

2022 ◽

Vol 5 (1) ◽

Author(s):

Raisa Abedin Disha ◽

Sajjad Waheed

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Performance Analysis ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Experimental Result ◽

Feature Selection Technique ◽

Selection Technique

To protect the network, resources, and sensitive data, the intrusion detection system (IDS) has become a fundamental component of organizations that prevents cybercriminal activities. Several approaches have been introduced and implemented to thwart malicious activities so far. Due to the effectiveness of machine learning (ML) methods, the proposed approach applied several ML models for the intrusion detection system. To evaluate the performance of the models, the UNSW-NB 15 and Network TON_IoT datasets were used for offline analysis. Both datasets are newer than the NSL-KDD dataset and better represent modern-day attacks. The performance analysis was carried out by training and testing the Decision Tree (DT), Gradient Boosting Tree (GBT), Multilayer Perceptron (MLP), AdaBoost, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) models on the binary classification task. As the performance of an IDS deteriorates with a high-dimensional feature vector, an optimum set of features was selected through a Gini Impurity-based Weighted Random Forest (GIWRF) model as the embedded feature selection technique. This technique employed Gini impurity as the splitting criterion of the trees and adjusted the weights of the two classes of the imbalanced data so that the learning algorithm accounts for the class distribution. Based on the importance score, 20 features were selected from the UNSW-NB 15 dataset and 10 features from the Network TON_IoT dataset. The experimental results revealed that DT performed better with the feature selection technique than the other trained models in this experiment. Moreover, the proposed GIWRF-DT outperformed other existing methods surveyed in the literature in terms of the F1 score.
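The role of class weights in a Gini-based split criterion can be illustrated with a short sketch. It is not the GIWRF implementation from the paper: the abstract does not state the exact weighting scheme, so inverse-frequency ("balanced") weights are assumed here, and the function names and toy labels are hypothetical. The impurity decrease of a split is the quantity that, accumulated over the trees of a forest, typically yields a feature importance score.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n / (n_classes * class_count). Assumed scheme."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: n / (len(counts) * count) for cls, count in counts.items()}


def weighted_gini(labels, weights):
    """Gini impurity computed on class-weighted sample mass instead of raw counts."""
    mass = Counter()
    for y in labels:
        mass[y] += weights[y]
    total = sum(mass.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((m / total) ** 2 for m in mass.values())


def impurity_decrease(parent, left, right, weights):
    """Weighted Gini decrease obtained by splitting `parent` into `left` and `right`."""
    def total_weight(part):
        return sum(weights[y] for y in part)

    w_parent = total_weight(parent)
    return (
        weighted_gini(parent, weights)
        - total_weight(left) / w_parent * weighted_gini(left, weights)
        - total_weight(right) / w_parent * weighted_gini(right, weights)
    )


# Toy imbalanced node (assumed): 8 normal samples (0) and 2 attack samples (1).
labels = [0] * 8 + [1] * 2
uniform = {0: 1.0, 1: 1.0}
balanced = balanced_class_weights(labels)

plain = weighted_gini(labels, uniform)        # impurity with raw counts
reweighted = weighted_gini(labels, balanced)  # impurity after rebalancing the classes
gain = impurity_decrease(labels, labels[:8], labels[8:], balanced)
print(plain, reweighted, gain)
```

With uniform weights the node looks fairly pure (about 0.32) because the majority class dominates; with balanced weights the impurity rises to 0.5, so a split that isolates the minority class is credited with a larger decrease.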

A Comparative Study of Machine Learning Classifiers for Network Intrusion Detection

Lecture Notes in Computer Science - Artificial Intelligence and Security ◽

10.1007/978-3-030-24265-7_7 ◽

2019 ◽

pp. 75-86 ◽

Cited By ~ 1

Author(s):

Farrukh Aslam Khan ◽

Abdu Gumaei

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Comparative Study ◽

Network Intrusion Detection ◽

Machine Learning Classifiers ◽

Network Intrusion ◽

Learning Classifiers

FSDroid:- A feature selection technique to detect malware from Android using Machine Learning Techniques

Multimedia Tools and Applications ◽

10.1007/s11042-020-10367-w ◽

2021 ◽

Author(s):

Arvind Mahindru ◽

A.L. Sangal

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Machine Learning Techniques ◽

Feature Selection Technique ◽

Selection Technique ◽

Learning Techniques
