A review of KDD99 dataset usage in intrusion detection and machine learning between 2010 and 2015

10.7287/peerj.preprints.1954 ◽

2016 ◽

Cited By ~ 4

Author(s):

Atilla Özgür ◽

Hamit Erdem

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Citation Index ◽

Academic Research ◽

Intrusion Detection Systems ◽

Classification Algorithms ◽

Detection Systems ◽

Research Areas ◽

Software Toolbox ◽

Wide Usage

Although the KDD99 dataset is more than 15 years old, it is still widely used in academic research. To investigate the wide usage of this dataset in Machine Learning Research (MLR) and Intrusion Detection Systems (IDS), this study reviews 149 research articles from 65 journals indexed in Science Citation Index Expanded and Emerging Sources Citation Index during the last six years (2010–2015). If papers from other indexes and conferences were included, the number of studies would triple. The number of published studies shows that KDD99 is the most used dataset in IDS and machine learning areas, and it is the de facto dataset for these research areas. To show recent usage of KDD99 and the related sub-dataset (NSL-KDD) in IDS and MLR, the following descriptive statistics about the reviewed studies are given: main contribution of articles, the applied algorithms, compared classification algorithms, software toolbox usage, the size and type of the used dataset for training and testing, and classification output classes (binary, multi-class). In addition to these statistics, a checklist for future researchers who work in this area is provided.


Machine Learning confronted with the operational constraints of detection systems

International Journal of Information Technology and Applied Sciences (IJITAS) ◽

10.52502/ijitas.v1i1.6 ◽

2019 ◽

Vol 1 (1) ◽

pp. 1-7

Author(s):

Sridarala Ramu ◽

Daniel Osaku

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Academic Research ◽

Intrusion Detection Systems ◽

Machine Learning Techniques ◽

Detection Model ◽

Detection Systems ◽

Learning Techniques ◽

High Level ◽

Operational Constraints

Intrusion detection systems, traditionally based on signatures, have not escaped the recent appeal of machine learning techniques. While the results presented in academic research articles are often excellent, security experts still have many reservations about the use of Machine Learning in intrusion detection systems. They generally fear that these techniques are poorly suited to operational constraints, in particular because of the high level of expertise required or the large number of false positives. In this article, we show that Machine Learning can be compatible with the operational constraints of detection systems. We explain how to build a detection model and present good practices to validate it before it goes into production. The methodology is illustrated by a case study on the detection of malicious PDF files, and we offer a free tool, SecuML, to implement it.


Launching Adversarial Attacks against Network Intrusion Detection Systems for IoT

Journal of Cybersecurity and Privacy ◽

10.3390/jcp1020014 ◽

2021 ◽

Vol 1 (2) ◽

pp. 252-273

Author(s):

Pavlos Papadopoulos ◽

Oliver Thornewill von Essen ◽

Nikolaos Pitropakis ◽

Christos Chrysoulas ◽

Alexios Mylonas ◽

...

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection Systems ◽

Learning Models ◽

Detection Systems ◽

Network Intrusion ◽

Robust Model ◽

Significant Probability ◽

Adversarial Examples ◽

Attack Surface

As the internet continues to be populated with new devices and emerging technologies, the attack surface grows exponentially. Technology is shifting towards a profit-driven Internet of Things market where security is an afterthought. Traditional defensive approaches are no longer sufficient to detect both known and unknown attacks with high accuracy. Machine learning intrusion detection systems have proven their success in identifying unknown attacks with high precision. Nevertheless, machine learning models are also vulnerable to attacks. Adversarial examples can be used to evaluate the robustness of a designed model before it is deployed. Further, using adversarial examples is critical to creating a robust model designed for an adversarial environment. Our work evaluates both traditional machine learning and deep learning models’ robustness using the Bot-IoT dataset. Our methodology included two main approaches: first, label poisoning, used to cause incorrect classification by the model; second, the fast gradient sign method, used to evade detection measures. The experiments demonstrated that an attacker could manipulate or circumvent detection with significant probability.
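The two attack families named in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Bot-IoT dataset: it assumes a toy logistic-regression detector with hypothetical fixed weights (`weights`, `bias`) and a made-up three-feature sample `x`. For such a model with cross-entropy loss, the gradient of the loss with respect to the input is `(p - y) * w`, so one fast gradient sign step is `x + epsilon * sign((p - y) * w)`. Label poisoning is shown in its simplest form, flipping chosen binary training labels.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that feature vector x is malicious under a logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y, epsilon):
    """One fast gradient sign step: x_adv = x + epsilon * sign(dLoss/dx).

    For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    """
    p = predict_proba(weights, bias, x)
    grad = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


def flip_labels(labels, indices):
    """Label poisoning in its simplest form: flip binary labels at given indices."""
    poisoned = list(labels)
    for i in indices:
        poisoned[i] = 1 - poisoned[i]
    return poisoned


# Hypothetical toy detector and sample (assumptions, not values from the paper).
weights = [2.0, -1.0, 0.5]
bias = -0.5
x = [1.0, 0.2, 0.4]  # a sample whose true label is malicious (y = 1)
y = 1

# Evasion: perturb each feature by at most epsilon in the loss-increasing direction.
x_adv = fgsm(weights, bias, x, y, epsilon=0.6)
print("score before:", predict_proba(weights, bias, x))
print("score after: ", predict_proba(weights, bias, x_adv))

# Poisoning: corrupt part of a training label vector before the model is fit.
print(flip_labels([0, 1, 1, 0], [1, 3]))
```

With these assumed values the detector's score for the sample drops from above the 0.5 decision threshold to below it, which is the evasion effect the abstract describes. A real evaluation would compute gradients through the trained model and constrain perturbations to valid feature ranges.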


Intrusion Detection Systems Based on Machine Learning Algorithms

2021 IEEE International Conference on Automatic Control & Intelligent Systems (I2CACIS) ◽

10.1109/i2cacis52118.2021.9495897 ◽

2021 ◽

Author(s):

Sandy Victor Amanoul ◽

Adnan Mohsin Abdulazeez ◽

Diyar Qader Zeebare ◽

Falah Y. H. Ahmed

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Detection Systems


Poisoning Attacks and Data Sanitization Mitigations for Machine Learning Models in Network Intrusion Detection Systems

10.1109/milcom52596.2021.9652916 ◽

2021 ◽

Author(s):

Sridhar Venkatesan ◽

Harshvardhan Sikka ◽

Rauf Izmailov ◽

Ritu Chadha ◽

Alina Oprea ◽

...

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection Systems ◽

Network Intrusion Detection ◽

Learning Models ◽

Detection Systems ◽

Data Sanitization ◽

Network Intrusion ◽

Network Intrusion Detection Systems ◽

Machine Learning Models


A Review of Intrusion Detection Systems: Datasets and machine learning methods

10.1145/3454127.3456576 ◽

2021 ◽

Author(s):

Aouatif ARQANE ◽

Omar Boutkhoum ◽

Hicham Boukhriss ◽

Abdelmajid El Moutaouakkil

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection Systems ◽

Learning Methods ◽

Detection Systems ◽

Machine Learning Methods


Applying Machine Learning Algorithms in Network-Based Intrusion Detection Systems

Lecture Notes in Electrical Engineering - Trends in Wireless Communication and Information Security ◽

10.1007/978-981-33-6393-9_24 ◽

2021 ◽

pp. 229-236

Author(s):

Nilesh Kumar Sahu ◽

Itu Snigdh

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Detection Systems


Application of Machine Learning Techniques in Intrusion Detection Systems: A Systematic Review

Advances in Intelligent Systems and Computing - Proceedings of Third International Conference on Sustainable Computing ◽

10.1007/978-981-16-4538-9_10 ◽

2022 ◽

pp. 97-105

Author(s):

Puneet Himthani ◽

Ghanshyam Prasad Dubey

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Intrusion Detection ◽

Intrusion Detection Systems ◽

Machine Learning Techniques ◽

Detection Systems ◽

Learning Techniques


Ensemble-Based Online Machine Learning Algorithms for Network Intrusion Detection Systems Using Streaming Data

Information ◽

10.3390/info11060315 ◽

2020 ◽

Vol 11 (6) ◽

pp. 315

Author(s):

Nathan Martindale ◽

Muhammad Ismail ◽

Douglas A. Talbert

Keyword(s):

Machine Learning ◽

Random Forest ◽

Intrusion Detection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Intrusion Detection Systems ◽

Network Intrusion Detection ◽

Detection Systems ◽

Network Intrusion ◽

Network Intrusion Detection Systems

As new cyberattacks are launched against systems and networks on a daily basis, the ability of network intrusion detection systems to operate efficiently in the big data era has become critically important, particularly as more low-power Internet-of-Things (IoT) devices enter the market. This has motivated research in applying machine learning algorithms that can operate on streams of data, trained online or “live” on only a small amount of data kept in memory at a time, as opposed to the more classical approaches that are trained solely offline on all of the data at once. In this context, one important concept from machine learning for improving detection performance is the idea of “ensembles”, where a collection of machine learning algorithms is combined to compensate for their individual limitations and produce an overall superior algorithm. Unfortunately, existing research lacks a proper performance comparison between homogeneous and heterogeneous online ensembles. Hence, this paper investigates several homogeneous and heterogeneous ensembles, proposes three novel online heterogeneous ensembles for intrusion detection, and compares their accuracy, run-time complexity, and response to concept drift. Out of the proposed novel online ensembles, the heterogeneous ensemble consisting of an adaptive random forest of Hoeffding Trees combined with a Hoeffding Adaptive Tree performed the best, by dealing with concept drift in the most effective way. While this scheme is less accurate than a larger adaptive random forest, it offered a marginally better run-time, which is beneficial for online training.


Intrusion Detection Using Machine Learning

Dynamic and Advanced Data Mining for Progressing Technological Development ◽

10.4018/978-1-60566-908-3.ch005 ◽

2010 ◽

pp. 70-107

Author(s):

Mohammed M. Mazid ◽

A. B.M. Shawkat Ali ◽

Kevin S. Tickle

Keyword(s):

Intrusion Detection ◽

Medical Diagnosis ◽

Computer Network ◽

Model Identification ◽

Intrusion Detection Systems ◽

Network Technology ◽

Rule Based ◽

Detection Systems ◽

Research Areas ◽

Future Direction

Intrusion detection has received enormous attention since the beginning of computer network technology. It is the task of detecting attacks against a network and its resources. To detect and counteract any unauthorized activity, it is desirable for network and system administrators to monitor the activities in their network. Over the last few years, a number of intrusion detection systems have been developed and are in use at commercial and academic institutions. However, some challenges remain to be solved. This chapter provides a review, a demonstration, and future directions for intrusion detection. The authors' emphasis is on various kinds of rule-based techniques for intrusion detection. The research also aims to summarize the effectiveness and limitations of intrusion detection technologies in medical diagnosis, control and model identification in engineering, decision making in marketing and finance, web and text mining, and some other research areas.
