Problems of KDD Cup 99 Dataset Existed and Data Preprocessing

The KDD Cup 99 dataset is not only the most widely used dataset in intrusion detection but also the de facto benchmark for evaluating the performance of intrusion detection systems. Nevertheless, the dataset has many issues that cannot be ignored. To build good data mining models for intrusion detection and to find the features that characterize each type of network intrusion, researchers need a thorough understanding of this dataset. In this paper, we first make an in-depth analysis of the problems in the dataset and give related solutions. Second, we carry out extensive data preprocessing on the 10% subset of the KDD Cup 99 training set, which gives better results in the subsequent processing. Finally, by comparing 10 common data mining algorithms in our experiment, we conclude that data preprocessing plays a vital role in the performance of data mining algorithms.
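The abstract does not say which problems the authors address. One widely reported issue with KDD Cup 99 is the large number of exact duplicate records, so the sketch below shows only that one preprocessing step, as an assumption about what such cleaning might involve. The `deduplicate` helper and the shortened sample rows are hypothetical; real records carry 41 features plus a label.

```python
import csv
import io
from collections import Counter


def deduplicate(rows):
    """Return rows with exact duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are unhashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up, shortened rows in a KDD-style "features..., label" layout.
# These are illustrative values, not records taken from the dataset.
SAMPLE = """0,tcp,http,SF,181,5450,normal.
0,tcp,http,SF,181,5450,normal.
0,icmp,ecr_i,SF,1032,0,smurf.
0,icmp,ecr_i,SF,1032,0,smurf.
0,icmp,ecr_i,SF,1032,0,smurf.
0,udp,private,SF,105,146,normal.
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
unique_rows = deduplicate(rows)

# Class balance before and after: duplicates can inflate frequent classes.
label_counts_before = Counter(row[-1] for row in rows)
label_counts_after = Counter(row[-1] for row in unique_rows)
```

Comparing `label_counts_before` with `label_counts_after` shows how much of each class was made up of repeated rows, which is one reason deduplication can change the results of the algorithms trained afterwards.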

A Performance Comparison of Data Mining Algorithms Based Intrusion Detection System for Smart Grid

2019 IEEE International Conference on Electro Information Technology (EIT) ◽

10.1109/eit.2019.8834255 ◽

2019 ◽

Author(s):

Zakaria El Mrabet ◽

Hassan El Ghazi ◽

Naima Kaabouch

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Smart Grid ◽

Intrusion Detection System ◽

Detection System ◽

Performance Comparison ◽

Data Mining Algorithms ◽

Mining Algorithms ◽

A Performance

Diverse Analysis of Data Mining and Machine Learning Algorithms to Secure Computer Network

10.21203/rs.3.rs-305354/v1 ◽

2021 ◽

Author(s):

Neeraj Kumar ◽

Upendra Kumar

Keyword(s):

Machine Learning ◽

Data Mining ◽

Intrusion Detection ◽

Dimensionality Reduction ◽

Intrusion Detection System ◽

Detection System ◽

Machine Learning Algorithms ◽

Classification Techniques ◽

Network Intrusion ◽

Depth Analysis

Abstract: Information and Communication Technologies have, to a large extent, a major influence on our social life and economy as well as on worldwide security. Computer networks underpin Information Technology as a whole. However, the world is never free from people with malicious intent, such as cyber criminals and network intruders. To counter them, an Intrusion Detection System (IDS) plays a significant role in identifying network intrusions by performing various data analysis tasks. To develop a robust IDS that detects intrusions accurately, various papers have been published over the years using different classification techniques from Data Mining (DM) and Machine Learning (ML), including hybrid approaches. The present paper is an in-depth analysis of two focal aspects of Network Intrusion Detection Systems: pre-processing methods in the form of dimensionality reduction, and an assortment of classification techniques. It also includes a comparative algorithmic analysis of the DM and ML techniques applied to design an intelligent IDS. An experimental comparative analysis was carried out in support of the findings of this work, using the Python language with the ‘kddcup99’ dataset as the benchmark. In this experiment, dimensionality reduction was found to have the larger impact, and MLP performed well at correct classification. The motive behind this effort is to detect different kinds of malware as early and as accurately as possible, and to give a clearer view of the existing techniques that may help interested researchers in future work.

An efficient Intrusion detection system for identification from Suspicious URLs using data mining algorithms

International Journal of Business Intelligence and Data Mining ◽

10.1504/ijbidm.2017.10004362 ◽

2017 ◽

Vol 12 (2) ◽

pp. 1

Author(s):

K. Rajitha ◽

Doddapaneni Vijaya Lakshmi

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Data Mining Algorithms ◽

Using Data ◽

Mining Algorithms

An efficient intrusion detection system for identification from suspicious URLs using data mining algorithms

International Journal of Business Intelligence and Data Mining ◽

10.1504/ijbidm.2017.084284 ◽

2017 ◽

Vol 12 (2) ◽

pp. 133

Author(s):

Kotoju Rajitha ◽

Doddapaneni VijayaLakshmi

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Data Mining Algorithms ◽

Using Data ◽

Mining Algorithms

Using Data Mining Algorithms for Developing a Model for Intrusion Detection System (IDS)

Procedia Computer Science ◽

10.1016/j.procs.2015.09.145 ◽

2015 ◽

Vol 61 ◽

pp. 46-51 ◽

Cited By ~ 30

Author(s):

Solane Duque ◽

Mohd. Nizam bin Omar

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Data Mining Algorithms ◽

Using Data ◽

Mining Algorithms

Survey on Hybrid Data Mining Algorithms for Intrusion Detection System

Data Management, Analytics and Innovation - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-13-1402-5_22 ◽

2018 ◽

pp. 291-298

Author(s):

Harshal N. Datir ◽

Pradip M. Jawandhiya

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Data Mining Algorithms ◽

Hybrid Data ◽

Mining Algorithms

Network intrusion detection: a comparative study of four classifiers using the NSL-KDD and KDD’99 datasets

Journal of Physics Conference Series ◽

10.1088/1742-6596/2161/1/012043 ◽

2022 ◽

Vol 2161 (1) ◽

pp. 012043

Author(s):

Ananya Devarakonda ◽

Nilesh Sharma ◽

Prita Saha ◽

S Ramya

Keyword(s):

Random Forest ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Data Preprocessing ◽

Vital Role ◽

Network Activity ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Network Intrusion Detection System

Abstract: As most of the population acquires access to the internet, protecting online identity from threats to confidentiality, integrity, and accessibility becomes an increasingly important problem. A network intrusion detection system (IDS) identifies anomalous network traffic in order to surface and classify suspicious activity. It is a fundamental part of network security and provides the first line of defense against a potential attack by alerting an administrator or other appropriate personnel to possible malicious network activity. Several academic publications propose artificial intelligence (AI) methods for accurate network intrusion detection. This paper outlines and compares four AI methods trained on two benchmark datasets: the KDD’99 and the NSL-KDD. Apart from model selection, data preprocessing plays a vital role in producing accurate solutions, and thus we propose a simple yet effective data preprocessing method. We also evaluate and compare the accuracy and performance of four popular models: decision tree (DT), multi-layer perceptron (MLP), random forest (RF), and a stacked autoencoder (SAE). Of the four methods, the random forest classifier showed the most consistent and accurate results.

A Comparative Study of Data Mining Algorithms for Network Intrusion Detection

2008 First International Conference on Emerging Trends in Engineering and Technology ◽

10.1109/icetet.2008.80 ◽

2008 ◽

Cited By ~ 26

Author(s):

Mrutyunjaya Panda ◽

Manas Ranjan Patra

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Comparative Study ◽

Network Intrusion Detection ◽

Data Mining Algorithms ◽

Network Intrusion ◽

Mining Algorithms

Comparing the Area of Data Mining Algorithms in Network Intrusion Detection

Journal of Information Security ◽

10.4236/jis.2020.111001 ◽

2020 ◽

Vol 11 (01) ◽

pp. 1-18

Author(s):

Yasamin Alagrash ◽

Azhar Drebee ◽

Nedda Zirjawi

Keyword(s):

Data Mining ◽

Intrusion Detection ◽

Network Intrusion Detection ◽

Data Mining Algorithms ◽

Network Intrusion ◽

Mining Algorithms

Data Preprocessing for Network Intrusion Detection

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.20-23.867 ◽

2010 ◽

Vol 20-23 ◽

pp. 867-871

Author(s):

Li Li ◽

Ye Yuan

Keyword(s):

Intrusion Detection ◽

Categorical Data ◽

Intrusion Detection System ◽

Detection System ◽

Data Preprocessing ◽

Network Intrusion Detection ◽

Paper Briefly ◽

Network Intrusion ◽

The Neural Network ◽

Data Source

Most IDSs (Intrusion Detection Systems) are very particular about their data source, which may be required to be categorical or to be correctly labeled. Therefore, data preprocessing is an indispensable part of intrusion detection. The KDD Cup 1999 Dataset is commonly used as experimental data. This paper briefly introduces the features and structure of the KDD Cup 1999 Dataset and presents a data preprocessing method for an Intrusion Detection System based on a neural network clustering algorithm.
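The abstract gives no detail of the preprocessing method itself. As a generic illustration only, and not the authors' procedure, the sketch below shows two steps commonly applied before feeding mixed records to a neural network: one-hot encoding a categorical field and min-max scaling a numeric one. The helper names `one_hot` and `min_max` and the sample values are hypothetical.

```python
def one_hot(values):
    """Map each categorical value to a 0/1 vector over the sorted categories."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        vector = [0.0] * len(categories)
        vector[index[value]] = 1.0
        encoded.append(vector)
    return categories, encoded


def min_max(values):
    """Scale numeric values linearly into [0, 1]; constant columns become 0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Made-up (protocol, byte count) pairs standing in for two record fields.
records = [("tcp", 181.0), ("icmp", 1032.0), ("udp", 105.0), ("tcp", 239.0)]

protocols = [protocol for protocol, _ in records]
byte_counts = [count for _, count in records]

categories, protocol_vectors = one_hot(protocols)
scaled_counts = min_max(byte_counts)

# Final numeric feature rows: one-hot protocol columns, then the scaled count.
features = [vec + [s] for vec, s in zip(protocol_vectors, scaled_counts)]
```

Each row in `features` is fully numeric and bounded, which suits distance-based or gradient-based learners. In practice the category list and the minimum and maximum would be computed on the training split only and reused for test data.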