Performance Evaluation of Intrusion Detection System using Selected Features and Machine Learning Classifiers

Some of the main challenges in developing an effective network-based intrusion detection system (IDS) include analyzing large network traffic volumes and identifying the decision boundaries between normal and abnormal behaviors. Deploying feature selection together with efficient classifiers in the detection system can overcome these problems. Feature selection finds the most relevant features, thus reducing the dimensionality and complexity of network traffic analysis. Moreover, building the predictive model from only the most relevant features reduces the complexity of the model, which shortens the model building time and consequently improves detection performance. In this study, two different sets of selected features were used to train four machine learning-based classifiers. The two sets of selected features are based on the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) approaches, respectively. These evolutionary-based algorithms are known to be effective in solving optimization problems. The classifiers used in this study are Naïve Bayes, k-Nearest Neighbor, Decision Tree, and Support Vector Machine, all trained and tested using the NSL-KDD dataset. The performance of these classifiers with the different feature sets was evaluated. The experimental results indicate that detection accuracy improves by approximately 1.55% when using the PSO-based selected features compared with the GA-based selected features. The Decision Tree classifier trained with PSO-based selected features outperformed the other classifiers, with accuracy, precision, recall, and F-score of 99.38%, 99.36%, 99.32%, and 99.34%, respectively. The results show that coupling optimal features with a good classifier in a detection system can reduce the classifier model building time, reduce the computational burden of analyzing data, and consequently attain a high detection rate.
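The four metrics reported in this abstract (accuracy, precision, recall, and F-score) can be computed from a binary confusion matrix. The following is a minimal sketch using the standard definitions; the function name, the label encoding (1 = attack), and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F-score for binary labels.

    `positive` marks the class treated as the attack class (an assumption
    made for this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Comparing two feature subsets, as the study does for GA and PSO, would amount to calling this function once per trained classifier and feature set.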

Anomaly-Based Intrusion Detection: Feature Selection and Normalization Influence to the Machine Learning Models Accuracy

European Journal of Engineering and Formal Sciences ◽

10.26417/ejef.v2i3.p101-106 ◽

2018 ◽

Vol 2 (3) ◽

pp. 101

Author(s):

Danijela Protić ◽

Miomir Stanković

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Decision Tree ◽

Network Traffic ◽

Intrusion Detection System ◽

Nearest Neighbor ◽

Reference Model ◽

Detection System ◽

Support Vector ◽

K Nearest Neighbor

An anomaly-based intrusion detection system detects intrusions into a computer network based on a reference model that must be able to identify the network's normal behavior and flag what is not normal. In this process, network traffic is classified into two groups by assigning different labels to normal and malicious behavior. The main disadvantage of an anomaly-based intrusion detection system is the need to learn the difference between normal and abnormal behavior. Another disadvantage is the complexity of datasets that simulate realistic network traffic. Feature selection and normalization can be used to reduce data complexity and decrease processing runtime by selecting a better feature space. This paper presents the results of testing the influence of feature selection and instance normalization on the classification performance of k-nearest neighbor, weighted k-nearest neighbor, support vector machine, and decision tree models on 10 days of records from the Kyoto 2006+ dataset. The data was pre-processed to remove all categorical features from the dataset. The resulting subset contained 17 features. Features containing instances that could not be normalized into the range [-1, 1] were also removed. The resulting subset consisted of nine features. The feature ‘Label’ categorized network traffic into two classes: normal (1) and malicious (0). The performance metric used to evaluate the models was accuracy. The proposed method resulted in very high accuracy values, with the decision tree giving the highest values for non-normalized data and k-nearest neighbor giving the highest values for normalized data.

Keywords: feature selection, normalization, k-NN, weighted k-NN, SVM, decision tree, Kyoto 2006+
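The normalization step described in this abstract can be illustrated with min-max scaling into [-1, 1]. This is only a sketch under stated assumptions: the abstract does not say which scaling method was used or why some features could not be normalized, so the rule here (drop constant columns, for which min-max scaling is undefined) and both function names are hypothetical.

```python
def scale_to_unit_range(column):
    """Min-max scale a numeric column into [-1, 1].

    Returns None when the column is constant, because the scaling is
    undefined in that case (assumed removal criterion for this sketch).
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        return None
    return [2 * (x - lo) / (hi - lo) - 1 for x in column]


def normalize_features(rows):
    """Scale every column of `rows`; drop columns that cannot be scaled.

    Returns the indices of the kept columns and the scaled rows.
    """
    kept, scaled_columns = [], []
    for idx, column in enumerate(zip(*rows)):
        scaled = scale_to_unit_range(column)
        if scaled is not None:
            kept.append(idx)
            scaled_columns.append(scaled)
    scaled_rows = [list(row) for row in zip(*scaled_columns)]
    return kept, scaled_rows


# Toy data, invented for illustration: the middle column is constant.
rows = [
    [0.0, 5.0, 10.0],
    [5.0, 5.0, 20.0],
    [10.0, 5.0, 40.0],
]
kept, scaled = normalize_features(rows)
print(kept)
print(scaled)
```

Distance-based models such as k-nearest neighbor are sensitive to feature scale, which is consistent with the abstract's finding that k-nearest neighbor performed best on normalized data.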

Hybrid Intrusion Detection System Based on the Stacking Ensemble of C5 Decision Tree Classifier and One Class Support Vector Machine

Electronics ◽

10.3390/electronics9010173 ◽

2020 ◽

Vol 9 (1) ◽

pp. 173 ◽

Cited By ~ 9

Author(s):

Ansam Khraisat ◽

Iqbal Gondal ◽

Peter Vamplew ◽

Joarder Kamruzzaman ◽

Ammar Alazab

Keyword(s):

Support Vector Machine ◽

Intrusion Detection ◽

Decision Tree ◽

False Alarm ◽

Intrusion Detection System ◽

Detection System ◽

Support Vector ◽

Decision Tree Classifier ◽

Tree Classifier ◽

False Alarm Rates

Cyberattacks are becoming increasingly sophisticated, necessitating efficient intrusion detection mechanisms to monitor computer resources and generate reports on anomalous or suspicious activities. Many Intrusion Detection Systems (IDSs) use a single classifier for identifying intrusions. Single-classifier IDSs are unable to achieve high accuracy and low false alarm rates due to the polymorphic, metamorphic, and zero-day behaviors of malware. In this paper, a Hybrid IDS (HIDS) is proposed by combining the C5 decision tree classifier and One Class Support Vector Machine (OC-SVM). HIDS combines the strengths of a Signature-based Intrusion Detection System (SIDS) and an Anomaly-based Intrusion Detection System (AIDS). The SIDS was developed based on the C5.0 decision tree classifier, and the AIDS was developed based on the one-class Support Vector Machine (SVM). This framework aims to identify both well-known intrusions and zero-day attacks with high detection accuracy and low false-alarm rates. The proposed HIDS is evaluated using two benchmark datasets, namely the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) and Australian Defence Force Academy (ADFA) datasets. The results show that HIDS outperforms SIDS and AIDS alone in terms of detection rate and false-alarm rate.
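The idea of pairing a signature stage with an anomaly stage can be sketched as a two-stage check. This is not the paper's implementation: a lookup in a hypothetical signature set stands in for the C5.0 decision tree, and a z-score test against a profile of normal traffic stands in for the one-class SVM. All names, fields, and thresholds below are assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical signature set; stands in for the trained signature-based stage.
KNOWN_SIGNATURES = {"syn_flood", "port_scan"}


def fit_normal_profile(values):
    """Summarize normal traffic by its mean and population standard deviation."""
    return mean(values), pstdev(values)


def hybrid_detect(record, profile, threshold=3.0):
    """Stage 1 flags known attacks; stage 2 flags deviations from normal."""
    if record["signature"] in KNOWN_SIGNATURES:
        return "known_attack"
    mu, sigma = profile
    z = abs(record["bytes"] - mu) / sigma if sigma else 0.0
    return "anomaly" if z > threshold else "normal"


# Toy traffic, invented for illustration.
profile = fit_normal_profile([100, 110, 90, 105, 95])
records = [
    {"signature": "port_scan", "bytes": 100},
    {"signature": "none", "bytes": 102},
    {"signature": "none", "bytes": 500},
]
results = [hybrid_detect(r, profile) for r in records]
print(results)
```

The second stage only sees records the first stage did not recognize, which is how a previously unseen (zero-day) pattern can still be flagged.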

An intelligent intrusion detection system using genetic based feature selection and Modified J48 decision tree classifier

2013 Fifth International Conference on Advanced Computing (ICoAC) ◽

10.1109/icoac.2013.6921918 ◽

2013 ◽

Cited By ~ 6

Author(s):

B. Senthilnayaki ◽

K. Venkatalakshmi ◽

A. Kannan

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Decision Tree ◽

Intrusion Detection System ◽

Detection System ◽

Decision Tree Classifier ◽

Tree Classifier ◽

J48 Decision Tree

Ensemble of Rule Learner and Sequential Minimum Optimization Algorithm for Intrusion Detection System

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a9559.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 501-506

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Decision Tree ◽

Optimization Algorithm ◽

False Positive ◽

Classification Accuracy ◽

Intrusion Detection System ◽

Detection System ◽

Support Vector ◽

True Positive

An intrusion detection system is a process that automates the analysis of activities in a network or a computer system. It is used to detect malicious code, hostile activities, intruders, and unsolicited communications over the Internet. General intrusion detection systems struggle with problems such as high false positive rates, high false negative rates, low classification accuracy, and slow speed. These issues have drawn the attention of many researchers. Recently, ensembles of different base classifiers have been widely used to implement intrusion detection systems. In ensemble methods of machine learning, the proper selection of base classifiers is a challenging task. In this paper, a machine learning ensemble is designed and implemented for an intrusion detection system. The ensemble combines a Partial Decision Tree (PART) rule learner with a support vector machine trained by the Sequential Minimum Optimization (SMO) algorithm. The Partial Decision Tree rule learner is simple and generates rules quickly. The Sequential Minimum Optimization algorithm is easy to use and scales better with training set size while requiring less computational time. Because of these advantages, the two classifiers are used jointly with different ensemble methods. We make use of all types of ensemble methods. The performance of the base classifiers is evaluated in terms of false positives, accuracy, and true positives. The results show that the proposed majority voting ensemble, using the Partial Decision Tree rule learner and the Sequential Minimum Optimization-based Support Vector Machine, offers the highest classification accuracy among the ensemble classifiers on the training dataset. This ensemble method exhibits the highest true positive and lowest false positive rates. It is also observed that stacking of PART and SMO exhibits the same, and the lowest, classification accuracy on the test dataset.
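Majority voting, the ensemble method highlighted in this abstract, picks for each record the label predicted by most base classifiers. The sketch below shows the voting step only; the prediction lists are invented, and the third classifier is a hypothetical addition so that no vote ends in a tie (the paper itself combines PART and SMO).

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list of voted labels.

    `predictions` holds one list of labels per classifier, all of equal
    length. On a tie, Counter.most_common keeps the label seen first.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Invented predictions for four records.
part_preds = ["attack", "normal", "attack", "normal"]
smo_preds = ["attack", "attack", "normal", "normal"]
third_preds = ["normal", "attack", "attack", "normal"]  # hypothetical classifier

ensemble = majority_vote([part_preds, smo_preds, third_preds])
print(ensemble)
```

With only two base classifiers every disagreement is a tie, so a practical voting ensemble needs either a tie-breaking rule or an odd number of voters.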

Optimal feature selection for machine learning based intrusion detection system by exploiting attribute dependence

Materials Today Proceedings ◽

10.1016/j.matpr.2021.04.643 ◽

2021 ◽

Author(s):

Ghanshyam Prasad Dubey ◽

Dr. Rakesh Kumar Bhujade

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Optimal Feature Selection ◽

Selection For ◽

Optimal Feature

An intelligent flow-based and signature-based IDS for SDNs using ensemble feature selection and a multi-layer machine learning-based classifier

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-200850 ◽

2020 ◽

pp. 1-20

Author(s):

K. Muthamil Sudar ◽

P. Deepalakshmi

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Network Architecture ◽

Detection System ◽

Software Defined Networking ◽

Control Plane ◽

Control Logic ◽

Data Plane

Software-defined networking is a new paradigm that overcomes problems associated with traditional network architecture by separating the control logic from data plane devices. It also enhances performance by providing a highly programmable interface that adapts to dynamic changes in network policies. As software-defined networking controllers are prone to single-point failures, providing security is one of the biggest challenges in this framework. This paper intends to provide an intrusion detection mechanism in both the control plane and the data plane to secure the controller and forwarding devices, respectively. In the control plane, we deployed a flow-based intrusion detection system that inspects every new incoming flow towards the controller. In the data plane, we deployed a signature-based intrusion detection system that inspects traffic between OpenFlow switches, using port mirroring to analyse and detect malicious activity. Our flow-based system works with the help of a trained, multi-layer machine learning-based classifier, while our signature-based system works with rule-based classifiers using the Snort intrusion detection system. The ensemble feature selection technique we adopted in the flow-based system helps to identify the prominent features and speed up the classification process. Our proposed work ensures a high level of security in the software-defined networking environment by working simultaneously in both the control plane and the data plane.

IntruDTree: A Machine Learning-Based Cyber Security Intrusion Detection Model

10.20944/preprints202004.0481.v1 ◽

2020 ◽

Author(s):

Iqbal H. Sarker ◽

Yoosef B. Abushark ◽

Fawaz Alsolami ◽

Asif Irshad Khan

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Cyber Security ◽

Intrusion Detection System ◽

Detection System ◽

Machine Learning Techniques ◽

Support Vector ◽

Security Model ◽

K Nearest Neighbor ◽

Detection Model

Cyber security has recently received enormous attention due to the popularity of the Internet-of-Things (IoT), the tremendous growth of computer networks, and the huge number of relevant applications. Thus, detecting various cyber-attacks or anomalies in a network and building an effective intrusion detection system, which plays an essential role in today’s security, is becoming more important. Artificial intelligence, particularly machine learning techniques, can be used for building such a data-driven intelligent intrusion detection system. To achieve this goal, in this paper we present an Intrusion Detection Tree (“IntruDTree”) machine-learning-based security model that first ranks the security features according to their importance and then builds a tree-based generalized intrusion detection model based on the selected important features. This model is not only effective in terms of prediction accuracy for unseen test cases but also minimizes the computational complexity of the model by reducing the feature dimensions. Finally, the effectiveness of our IntruDTree model was examined by conducting experiments on cybersecurity datasets and computing the precision, recall, F-score, accuracy, and ROC values. We also compare the results of the IntruDTree model with several traditional popular machine learning methods, such as the naive Bayes classifier, logistic regression, support vector machines, and k-nearest neighbor, to analyze the effectiveness of the resulting security model.

Scrutinizing Attacks and Evaluating Performance Appraisal Parameters via Feature Selection in Intrusion Detection System

10.21203/rs.3.rs-748765/v1 ◽

2021 ◽

Author(s):

Navroop Kaur ◽

Meenakshi Bansal ◽

Sukhwinder Singh S

Keyword(s):

Feature Selection ◽

Performance Evaluation ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Denial Of Service ◽

Cyber Attacks ◽

Support Vector ◽

K Nearest Neighbor ◽

Evaluation Parameters

In modern times, firewalls and antivirus packages are not good enough to protect an organization from numerous cyber attacks. A computer IDS (Intrusion Detection System) is a crucial aspect that contributes to the success of an organization. An IDS is a software application responsible for scanning organization networks for suspicious activities and policy violations. An IDS ensures the secure and reliable functioning of the network within an organization. IDSs have undergone huge transformations since their origin to cope with advancing computer crimes. The primary motive of an IDS has been to improve its ability to detect attacks without endangering the performance of the network. The research paper elaborates on the different types of IDS and the functions they perform. The NSL-KDD dataset has been considered for training and testing. Seven prominent classifiers, LR (Logistic Regression), NB (Naïve Bayes), DT (Decision Tree), AB (AdaBoost), RF (Random Forest), kNN (k Nearest Neighbor), and SVM (Support Vector Machine), have been studied along with their pros and cons, and feature selection has been applied to improve the performance evaluation parameters (Accuracy, Precision, Recall, and F1Score). The paper presents a detailed flowchart and algorithm depicting the procedure to perform feature selection using XGB (Extreme Gradient Booster) for four categories of attacks: DoS (Denial of Service), Probe, R2L (Remote to Local Attack), and U2R (User to Root Attack). The selected features have been ranked by their occurrence. The implementation has been conducted at five different train-test ratios: 60-40%, 70-30%, 90-10%, 50-50%, and 80-20%. Different classifiers scored best for different performance evaluation parameters at different ratios. NB scored the best Accuracy and Recall values. DT and RF consistently performed with high accuracy. NB, SVM, and kNN achieved good F1Score.
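The five ratios listed in this abstract can be read as train-test splits of the same dataset. The sketch below shows one way to produce such splits; the function name, the fixed seed, and the placeholder records are assumptions, and the abstract does not state how the authors shuffled or partitioned NSL-KDD.

```python
import random


def split_dataset(records, train_fraction, seed=0):
    """Shuffle a copy of `records` and split it into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps splits repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder records standing in for dataset rows.
records = list(range(100))
ratios = [0.5, 0.6, 0.7, 0.8, 0.9]
sizes = {}
for ratio in ratios:
    train, test = split_dataset(records, ratio)
    sizes[ratio] = (len(train), len(test))
print(sizes)
```

Evaluating each classifier at every ratio shows how sensitive its scores are to the amount of training data, which matches the abstract's observation that the best classifier varied by ratio.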

Data Mining: A Bagged Decision Tree Classifier Algorithm For Ids Intrusion Detection System Based Attacks Classification

Design Engineering ◽

10.17762/de.v2021i04.1800 ◽

2021 ◽

pp. 1826-1839

Author(s):

Sandeep Adhikari ◽

Dr. Sunita Chaudhary

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Decision Tree ◽

Intrusion Detection System ◽

Detection System ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Decision Tree Classifier ◽

Tree Classifier

The exponential growth in the use of computers over networks, as well as the proliferation of applications that operate on different platforms, has drawn attention to network security. Attackers take advantage of security flaws in all operating systems that are both technically difficult and costly to fix. As a result, intrusion is used as a key to compromise a computer resource's credibility, availability, and confidentiality. The Intrusion Detection System (IDS) is critical in detecting network anomalies and attacks. In this paper, the data mining principle is combined with IDS to efficiently and quickly identify important, secret data of interest to the user. The proposed algorithm addresses four issues: data classification, high levels of human interaction, lack of labeled data, and the effectiveness of distributed denial of service attacks. We are also working on a decision tree classifier that has a variety of parameters. The previous algorithm classified IDS up to 90% of the time and was not appropriate for large data sets. Our proposed algorithm was designed to accurately classify large data sets. In addition, we quantify a few more decision tree classifier parameters.

A Machine Learning Framework for Intrusion Detection System in IoT Networks Using an Ensemble Feature Selection Method

10.1109/iemcon53756.2021.9623082 ◽

2021 ◽

Author(s):

Ge Guo

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Feature Selection Method ◽

Selection Method ◽

Learning Framework
