Evaluation of Supervised Machine Learning Algorithms for Multi-class Intrusion Detection Systems

2021 · pp. 1-16
Author(s): Sanaa Kaddoura, Amal El Arid, Mirna Moukhtar
Information · 2020 · Vol 11 (6) · pp. 315
Author(s): Nathan Martindale, Muhammad Ismail, Douglas A. Talbert

As new cyberattacks are launched against systems and networks on a daily basis, the ability of network intrusion detection systems to operate efficiently in the big data era has become critically important, particularly as more low-power Internet-of-Things (IoT) devices enter the market. This has motivated research into machine learning algorithms that can operate on streams of data, trained online or “live” on only a small amount of data kept in memory at a time, as opposed to the more classical approaches that are trained offline on all of the data at once. In this context, one important concept for improving detection performance is the “ensemble”, in which a collection of machine learning algorithms is combined to compensate for the individual algorithms' limitations and produce an overall superior algorithm. Unfortunately, existing research lacks a proper performance comparison between homogeneous and heterogeneous online ensembles. Hence, this paper investigates several homogeneous and heterogeneous ensembles, proposes three novel online heterogeneous ensembles for intrusion detection, and compares their accuracy, run-time complexity, and response to concept drift. Of the proposed online ensembles, the heterogeneous ensemble consisting of an adaptive random forest of Hoeffding Trees combined with a Hoeffding Adaptive Tree performed best, handling concept drift most effectively. While this scheme is less accurate than a larger adaptive random forest, it offers a marginally better run-time, which is beneficial for online training.
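To make the streaming setup concrete, the following is a minimal sketch of an online heterogeneous ensemble of the kind described above, assuming the open-source river streaming-ML library (class names such as forest.ARFClassifier vary across river versions) and a placeholder (features, label) stream; the majority vote is implemented by hand, and the code is an illustrative reconstruction rather than the authors' implementation.

    # Sketch: online heterogeneous ensemble for streaming intrusion detection.
    # Assumes the `river` library; the member combination mirrors the paper's
    # best scheme (adaptive random forest + Hoeffding Adaptive Tree), but this
    # is an illustrative reconstruction, not the authors' code.
    from collections import Counter

    from river import forest, metrics, tree

    # Heterogeneous, drift-aware ensemble members.
    members = [
        forest.ARFClassifier(n_models=5, seed=42),
        tree.HoeffdingAdaptiveTreeClassifier(seed=42),
    ]

    metric = metrics.Accuracy()

    def majority_vote(x):
        """Combine member predictions by simple majority vote."""
        votes = [m.predict_one(x) for m in members]
        votes = [v for v in votes if v is not None]  # members may be untrained yet
        return Counter(votes).most_common(1)[0][0] if votes else None

    # `stream` is a placeholder for any (features, label) generator, e.g. a
    # river dataset or a live flow-record feed.
    def run(stream):
        for x, y in stream:
            y_pred = majority_vote(x)  # test-then-train evaluation
            if y_pred is not None:
                metric.update(y, y_pred)
            for m in members:
                m.learn_one(x, y)      # online update, one sample at a time
        return metric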


2021 · Vol 2021 · pp. 1-28
Author(s): Khalid M. Al-Gethami, Mousa T. Al-Akhras, Mohammed Alawairdhi

Optimizing the detection of intrusions is becoming more crucial due to the continuously rising rate and ferocity of cyber threats and attacks. One popular way to optimize the accuracy of intrusion detection systems (IDSs) is to employ machine learning (ML) techniques. However, many factors affect the accuracy of ML-based IDSs. One of these factors is noise, which can take the form of mislabelled instances, outliers, or extreme values. Determining the extent of the effect of noise helps in designing and building more robust ML-based IDSs. This paper empirically examines the effect of noise on the accuracy of ML-based IDSs through a wide set of experiments. The ML algorithms used are decision tree (DT), random forest (RF), support vector machine (SVM), artificial neural network (ANN), and Naïve Bayes (NB). The experiments are conducted on two widely used intrusion datasets, NSL-KDD and UNSW-NB15. Moreover, the paper also investigates the use of these ML algorithms as base classifiers with two ensemble learning methods, bagging and boosting. The detailed results and findings are illustrated and discussed in this paper.
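As a concrete illustration of this kind of noise experiment, the sketch below flips an increasing fraction of training labels (mislabelled-instance noise) and measures the resulting test accuracy of several of the listed classifiers plus bagging and boosting ensembles; it uses scikit-learn with a synthetic stand-in dataset, so the data-generation step and noise rates are assumptions rather than the paper's exact protocol.

    # Sketch: measuring the effect of label noise on ML-based IDS accuracy.
    # Synthetic data stands in for NSL-KDD / UNSW-NB15.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                                  RandomForestClassifier)
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=5000, n_features=20, n_classes=2,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    def flip_labels(labels, rate, rng):
        """Return a copy of `labels` with a `rate` fraction flipped (binary case)."""
        noisy = labels.copy()
        idx = rng.choice(len(noisy), size=int(rate * len(noisy)), replace=False)
        noisy[idx] = 1 - noisy[idx]
        return noisy

    models = {
        "DT": DecisionTreeClassifier(random_state=0),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
        "SVM": SVC(),
        "ANN": MLPClassifier(max_iter=300, random_state=0),
        "NB": GaussianNB(),
        "Bagging(DT)": BaggingClassifier(DecisionTreeClassifier(),
                                         n_estimators=50, random_state=0),
        "Boosting(DT)": AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                           n_estimators=50, random_state=0),
    }

    rng = np.random.default_rng(0)
    for rate in (0.0, 0.1, 0.2, 0.3):  # assumed mislabelling rates
        y_noisy = flip_labels(y_tr, rate, rng)
        for name, model in models.items():
            acc = accuracy_score(y_te, model.fit(X_tr, y_noisy).predict(X_te))
            print(f"noise={rate:.0%} {name}: accuracy={acc:.3f}")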


2021 · Vol 17 (10) · pp. 155014772110522
Author(s): Erman Özer, Murat İskefiyeli, Jahongir Azimjonov

Intrusion detection systems play a vital role in traffic flow monitoring on Internet of Things networks by providing a secure network traffic environment and blocking unwanted traffic packets. Various intrusion detection approaches have been proposed previously, based on data mining, fuzzy techniques, genetic and neurogenetic algorithms, particle swarm intelligence, rough sets, and conventional machine learning. However, these methods are neither energy efficient nor accurate, due to inappropriate feature selection or the use of all features of a dataset. In general, datasets contain more than 10 features, and any machine learning-based intrusion detection system trained on the full feature set turns into an inefficient, heavyweight system. This is a challenge for Internet of Things networks, which suffer from power-efficiency constraints. Therefore, lightweight (energy-efficient), accurate, and high-performance intrusion detection systems are paramount. To address these challenges, a new approach was proposed to determine the most effective and optimal feature pairs of a dataset, enabling the development of lightweight intrusion detection systems. For this purpose, 10 machine learning algorithms and the recent BoT-IoT (2018) dataset were selected. The 12 best features recommended by the developers of this dataset were used in this study, and 66 unique feature pairs were generated from them. First, 10 full-feature-based intrusion detection systems were developed by training the 10 machine learning algorithms on the 12 full features. Then, 660 feature-pair-based lightweight intrusion detection systems were developed by training the 10 machine learning algorithms on each of the 66 feature pairs. The 10 systems trained on the 12 best features and the 660 systems trained on the 66 feature pairs were compared within each machine learning algorithmic group, and the feature-pair-based lightweight systems that reached the accuracy level of the full-feature-based systems were selected. In this way, the optimal and efficient feature pairs, and with them the lightweight intrusion detection systems, were determined; the most lightweight systems achieved more than 90% detection accuracy.
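The exhaustive feature-pair search is straightforward to sketch. The version below, assuming scikit-learn, a random forest as a stand-in for any one of the 10 algorithms, and a generic feature matrix in place of BoT-IoT's 12 recommended features, trains one model per pair and keeps the pairs whose accuracy stays within a tolerance of the full-feature baseline.

    # Sketch: exhaustive feature-pair search for lightweight IDS models.
    # A generic feature matrix stands in for the BoT-IoT dataset; the
    # procedure (C(12, 2) = 66 pairs per algorithm, compared against a
    # 12-feature baseline) follows the paper's description.
    from itertools import combinations

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    def best_feature_pairs(X, y, feature_names, tolerance=0.02):
        """Return the full-feature baseline accuracy and every (pair, accuracy)
        within `tolerance` of it. X: 2-D array with columns = feature_names."""
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

        def fit_score(cols):
            model = RandomForestClassifier(n_estimators=50, random_state=0)
            model.fit(X_tr[:, cols], y_tr)
            return accuracy_score(y_te, model.predict(X_te[:, cols]))

        baseline = fit_score(list(range(len(feature_names))))  # all features
        results = []
        # One lightweight model per unique feature pair.
        for i, j in combinations(range(len(feature_names)), 2):
            acc = fit_score([i, j])
            if acc >= baseline - tolerance:
                results.append(((feature_names[i], feature_names[j]), acc))
        return baseline, sorted(results, key=lambda r: -r[1])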

