Empirical Evaluation of Noise Influence on Supervised Machine Learning Algorithms Using Intrusion Detection Datasets

Optimizing intrusion detection is becoming more crucial as the rate and ferocity of cyber threats and attacks continue to rise. One popular way to improve the accuracy of intrusion detection systems (IDSs) is to employ machine learning (ML) techniques. However, many factors affect the accuracy of ML-based IDSs. One of these factors is noise, which can take the form of mislabelled instances, outliers, or extreme values. Determining the extent of the effect of noise helps in designing and building more robust ML-based IDSs. This paper empirically examines the extent to which noise affects the accuracy of ML-based IDSs by conducting a wide range of experiments. The ML algorithms used are decision tree (DT), random forest (RF), support vector machine (SVM), artificial neural networks (ANNs), and Naïve Bayes (NB). The experiments are conducted on two widely used intrusion datasets, NSL-KDD and UNSW-NB15. The paper also investigates the use of these ML algorithms as base classifiers with two ensemble learning methods, bagging and boosting. The detailed results and findings are presented and discussed in the paper.
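One of the noise types named in this abstract, mislabelled instances, can be simulated by flipping a controlled fraction of training labels before a classifier is fitted. The sketch below is only an illustration of that idea, not the procedure used in the paper: the function name inject_label_noise, the noise rates and the toy label list are all assumptions.

```python
import random

def inject_label_noise(labels, noise_rate, classes, seed=0):
    """Return a copy of `labels` with round(noise_rate * n) entries changed to another class."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        # Choose any class other than the current one, so each picked instance is truly mislabelled.
        noisy[i] = rng.choice([c for c in classes if c != noisy[i]])
    return noisy

# Toy label vector (hypothetical, not taken from NSL-KDD or UNSW-NB15).
labels = ["normal"] * 80 + ["attack"] * 20
for rate in (0.0, 0.1, 0.2):
    noisy = inject_label_noise(labels, rate, classes=["normal", "attack"])
    changed = sum(a != b for a, b in zip(labels, noisy))
    print(f"noise rate {rate:.0%}: {changed} of {len(labels)} labels flipped")
```

In a full experiment, each noisy label vector would be used to train a classifier whose accuracy is then measured on a clean test set, so that the drop in accuracy can be attributed to the injected noise.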

Evaluation of Supervised Machine Learning Algorithms for Multi-class Intrusion Detection Systems

10.1007/978-3-030-89912-7_1 ◽

2021 ◽

pp. 1-16

Author(s):

Sanaa Kaddoura ◽

Amal El Arid ◽

Mirna Moukhtar

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Intrusion Detection Systems ◽

Detection Systems

ANOMALY DETECTION USING MACHINE LEARNING APPROACHES

Azerbaijan Journal of High Performance Computing ◽

10.32010/26166127.2020.3.2.196.206 ◽

2020 ◽

Vol 3 (2) ◽

pp. 196-206

Author(s):

Mausumi Das Nath ◽

Tapalina Bhattasali

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Vital Role ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Abnormal Behavior ◽

Support Vector ◽

Learning Approaches ◽

Detection Systems ◽

Internet Users

Due to the enormous usage of the Internet, users share resources and exchange voluminous amounts of data. This increases the risk of data theft and other types of attacks. Network security plays a vital role in protecting the electronic exchange of data and attempts to avoid financial disruption or disrupted services due to unknown proliferations in the network. Intrusion Detection Systems (IDS) are commonly used to detect such unknown attacks and unauthorized access in a network. Researchers have put forward many approaches that showed satisfactory results in intrusion detection, ranging from traditional approaches to Artificial Intelligence (AI) based approaches. AI-based techniques have gained an edge over other statistical techniques in the research community due to their enormous benefits. Procedures can be designed to display behavior learned from previous experiences. Machine learning algorithms are used to analyze the abnormal instances in a particular network. Supervised learning is essential for training on and analyzing abnormal behavior in a network. In this paper, we propose a model of Naïve Bayes and SVM (Support Vector Machine) to detect anomalies, and an ensemble approach to address the weaknesses and remove the poor detection results.
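The abstract does not say how the ensemble combines the Naïve Bayes and SVM models. One common option, shown here purely as an assumption, is soft voting: the per-class probabilities of the two classifiers are averaged and the class with the highest average wins. The function name soft_vote and the probability values are hypothetical.

```python
def soft_vote(prob_dicts, weights=None):
    """Average per-class probabilities from several classifiers; return (label, averages)."""
    weights = weights or [1.0] * len(prob_dicts)
    total = sum(weights)
    classes = prob_dicts[0].keys()
    avg = {
        c: sum(w * p[c] for w, p in zip(weights, prob_dicts)) / total
        for c in classes
    }
    # The predicted label is the class with the highest averaged probability.
    return max(avg, key=avg.get), avg

# Hypothetical outputs for one network record (not results from the paper).
nb_probs = {"normal": 0.60, "attack": 0.40}   # Naive Bayes leans towards "normal"
svm_probs = {"normal": 0.10, "attack": 0.90}  # SVM (with calibrated scores) is confident of "attack"

label, avg = soft_vote([nb_probs, svm_probs])
print(label, avg)
```

Here the confident SVM outweighs the uncertain Naïve Bayes vote, which is the kind of compensation between base classifiers that an ensemble is meant to provide.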

A Machine Learning Approach for Improving the Performance of Network Intrusion Detection Systems

Annals of Emerging Technologies in Computing ◽

10.33166/aetic.2021.05.025 ◽

2021 ◽

Vol 5 (5) ◽

pp. 201-208

Author(s):

Adnan Helmi Azizan ◽

Salama A. Mostafa ◽

Aida Mustapha ◽

Cik Feresa Mohd Foozy ◽

Mohd Helmy Abd Wahab ◽

...

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Knowledge Discovery In Databases ◽

Network Intrusion Detection ◽

Support Vector ◽

Average Precision ◽

Detection Systems ◽

Network Intrusion

Intrusion detection systems (IDS) are used to analyze huge volumes of data and diagnose anomalous traffic such as DDoS attacks; thus, an efficient traffic classification method is necessary for the IDS. IDS models attempt to decrease the false alarm rate and increase the true alarm rate in order to improve the accuracy of the system. To address this concern, three machine learning algorithms are tested and evaluated in this research: decision jungle (DJ), random forest (RF) and support vector machine (SVM). The main objective is to propose an ML-based network intrusion detection system (ML-based NIDS) model that compares the performance of the three algorithms based on their accuracy and precision on anomalous traffic. The knowledge discovery in databases (KDD) methodology and the intrusion detection evaluation dataset (CIC-IDS2017) are used in the testing; both are considered benchmarks in the evaluation of IDS. The average accuracy is 98.18% for the SVM, 96.76% for RF and 96.50% for DJ, so the SVM achieves the highest accuracy. The average precision is 98.74 for the SVM, 97.96 for RF and 97.82 for DJ, so the SVM again outperforms the other two algorithms. The average recall is 95.63 for the SVM, 97.62 for RF and 95.77 for DJ, so RF achieves the highest average recall. Overall, the SVM algorithm is found to be the best algorithm for detecting an intrusion in the system.
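The three measures reported in this abstract can rank classifiers differently, as the recall figures show. The sketch below computes accuracy, precision and recall from true and predicted labels for two made-up detectors. The function name binary_metrics and the toy predictions are assumptions for illustration and do not reproduce the paper's data.

```python
def binary_metrics(y_true, y_pred, positive="attack"):
    """Return (accuracy, precision, recall), treating `positive` as the alarm class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true alarms
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed attacks
    tn = sum(t != positive and p != positive for t, p in pairs)  # correctly ignored
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Toy ground truth: 4 attacks followed by 6 normal records.
y_true = ["attack"] * 4 + ["normal"] * 6
# A cautious detector raises no false alarms but misses one attack.
cautious = ["attack"] * 3 + ["normal"] * 7
# An eager detector catches every attack but raises two false alarms.
eager = ["attack"] * 6 + ["normal"] * 4

for name, y_pred in (("cautious", cautious), ("eager", eager)):
    acc, prec, rec = binary_metrics(y_true, y_pred)
    print(f"{name}: accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

On this toy data the cautious detector has the higher accuracy and precision while the eager one has the higher recall, the same kind of split the abstract reports between SVM and RF.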

Intrusion Detection Systems Based on Machine Learning Algorithms

2021 IEEE International Conference on Automatic Control & Intelligent Systems (I2CACIS) ◽

10.1109/i2cacis52118.2021.9495897 ◽

2021 ◽

Author(s):

Sandy Victor Amanoul ◽

Adnan Mohsin Abdulazeez ◽

Diyar Qader Zeebare ◽

Falah Y. H. Ahmed

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Detection Systems

Applying Machine Learning Algorithms in Network-Based Intrusion Detection Systems

Lecture Notes in Electrical Engineering - Trends in Wireless Communication and Information Security ◽

10.1007/978-981-33-6393-9_24 ◽

2021 ◽

pp. 229-236

Author(s):

Nilesh Kumar Sahu ◽

Itu Snigdh

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Detection Systems

Ensemble-Based Online Machine Learning Algorithms for Network Intrusion Detection Systems Using Streaming Data

Information ◽

10.3390/info11060315 ◽

2020 ◽

Vol 11 (6) ◽

pp. 315

Author(s):

Nathan Martindale ◽

Muhammad Ismail ◽

Douglas A. Talbert

Keyword(s):

Machine Learning ◽

Random Forest ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Network Intrusion Detection ◽

Detection Systems ◽

Network Intrusion ◽

Network Intrusion Detection Systems

As new cyberattacks are launched against systems and networks on a daily basis, the ability for network intrusion detection systems to operate efficiently in the big data era has become critically important, particularly as more low-power Internet-of-Things (IoT) devices enter the market. This has motivated research in applying machine learning algorithms that can operate on streams of data, trained online or “live” on only a small amount of data kept in memory at a time, as opposed to the more classical approaches that are trained solely offline on all of the data at once. In this context, one important concept from machine learning for improving detection performance is the idea of “ensembles”, where a collection of machine learning algorithms are combined to compensate for their individual limitations and produce an overall superior algorithm. Unfortunately, existing research lacks proper performance comparison between homogeneous and heterogeneous online ensembles. Hence, this paper investigates several homogeneous and heterogeneous ensembles, proposes three novel online heterogeneous ensembles for intrusion detection, and compares their performance accuracy, run-time complexity, and response to concept drifts. Out of the proposed novel online ensembles, the heterogeneous ensemble consisting of an adaptive random forest of Hoeffding Trees combined with a Hoeffding Adaptive Tree performed the best, by dealing with concept drift in the most effective way. While this scheme is less accurate than a larger size adaptive random forest, it offered a marginally better run-time, which is beneficial for online training.
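Online training of a heterogeneous ensemble, as described in this abstract, is often evaluated in a test-then-train (prequential) loop: each arriving example is first predicted by the ensemble and only afterwards used to update every member. The sketch below illustrates that loop with three deliberately simple learners combined by majority vote. They are stand-ins chosen for brevity and are not the Hoeffding trees or adaptive random forests studied in the paper; all class names, the single numeric feature and the toy stream are hypothetical.

```python
from collections import Counter, defaultdict, deque

class MajorityClass:
    """Predicts the label seen most often so far."""
    def __init__(self):
        self.counts = Counter()
    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None
    def learn(self, x, y):
        self.counts[y] += 1

class RecentMajority:
    """Predicts the most common label among the last `size` examples, forgetting older ones."""
    def __init__(self, size=5):
        self.window = deque(maxlen=size)
    def predict(self, x):
        return Counter(self.window).most_common(1)[0][0] if self.window else None
    def learn(self, x, y):
        self.window.append(y)

class NearestMean:
    """Keeps a running mean of one numeric feature per class and predicts the closest class."""
    def __init__(self):
        self.sums, self.counts = defaultdict(float), defaultdict(int)
    def predict(self, x):
        if not self.counts:
            return None
        return min(self.counts, key=lambda c: abs(x - self.sums[c] / self.counts[c]))
    def learn(self, x, y):
        self.sums[y] += x
        self.counts[y] += 1

def prequential_accuracy(models, stream):
    """Test-then-train: predict each example by majority vote, then update every model."""
    correct = total = 0
    for x, y in stream:
        votes = [v for v in (m.predict(x) for m in models) if v is not None]
        if votes:  # skip scoring until at least one model has seen data
            total += 1
            correct += Counter(votes).most_common(1)[0][0] == y
        for m in models:
            m.learn(x, y)
    return correct / total if total else 0.0

# Toy stream of (feature value, label) pairs, e.g. a hypothetical packet rate.
stream = [(1.0, "normal"), (1.2, "normal"), (9.0, "attack"), (0.8, "normal"),
          (8.5, "attack"), (9.3, "attack"), (1.1, "normal"), (8.9, "attack")]
ensemble = [MajorityClass(), RecentMajority(size=3), NearestMean()]
print(f"prequential accuracy: {prequential_accuracy(ensemble, stream):.2f}")
```

Only the small state held by each learner is kept in memory, which is the property that makes this style of training attractive for streaming data. The windowed learner also hints at how forgetting old examples can help a model react to concept drift.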

Taxonomy of Supervised Machine Learning for Intrusion Detection Systems

Strategic Innovative Marketing and Tourism - Springer Proceedings in Business and Economics ◽

10.1007/978-3-030-36126-6_69 ◽

2020 ◽

pp. 619-628

Author(s):

Ahmed Ahmim ◽

Mohamed Amine Ferrag ◽

Leandros Maglaras ◽

Makhlouf Derdour ◽

Helge Janicke ◽

...

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Supervised Machine Learning ◽

Intrusion Detection Systems ◽

Detection Systems

Performance Evaluation of Container-Level Anomaly-Based Intrusion Detection Systems for Multi-Tenant Applications Using Machine Learning Algorithms

The 16th International Conference on Availability, Reliability and Security ◽

10.1145/3465481.3470066 ◽

2021 ◽

Author(s):

Marcos Cavalcanti ◽

Pedro Inacio ◽

Mario Freire

Keyword(s):

Machine Learning ◽

Performance Evaluation ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Detection Systems

A Detailed Description on Unsupervised Heterogeneous Anomaly Based Intrusion Detection Framework

Scalable Computing Practice and Experience ◽

10.12694/scpe.v20i1.1465 ◽

2019 ◽

Vol 20 (1) ◽

pp. 113-160 ◽

Cited By ~ 2

Author(s):

Asif Iqbal Hajamydeen ◽

Nur Izura Udzir

Keyword(s):

Machine Learning ◽

Data Mining ◽

Intrusion Detection ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Feature Analysis ◽

Multiple Sources ◽

Detection Systems ◽

A Current ◽

Current Segment

Observing network traffic flow for anomalies is a common method in intrusion detection. Much effort has gone into using data mining and machine learning algorithms to construct anomaly-based intrusion detection systems, but these methods still depend on learned models built from earlier network behaviour, which restricts them in detecting new or unknown intrusions. Consequently, this investigation proposes a structure to identify an extensive variety of abnormalities by analysing heterogeneous logs, without utilizing either a prepared model of system transactions or the attributes of anomalies. To accomplish this, a current segment (clustering) has been used and a few new parts (filtering, aggregating and feature analysis) have been presented. Several logs from multiple sources are used as input, and these data are processed by all the modules of the framework. As each segment is instrumented for a particular undertaking towards a definitive objective, the commitment of each segment towards abnormality recognition is estimated with various execution measurements. Ultimately, the framework is able to detect a broad range of intrusions that exist in the logs without using either attack knowledge or traffic behavioural models. The result achieved shows the direction or pathway to design anomaly detectors that can utilize raw traffic logs collected from heterogeneous sources on the monitored network and correlate the events across the logs to detect intrusions.
