Data-driven network intrusion detection (NID) suffers from class imbalance: attack classes are minorities compared with normal traffic. In addition, many datasets are collected in simulated environments rather than real-world networks. These challenges undermine the performance of intrusion detection models because the models are fitted to unrepresentative “sandbox” datasets. This survey presents a taxonomy of eight main challenges and explores common datasets from 1999 to 2020. It analyzes trends in these challenges over the past decade and proposes future directions: expanding NID into cloud-based environments, devising scalable models for large network data, and creating labeled datasets collected in real-world networks.
Deep learning models are a powerful tool for detecting IoT attacks and identifying new types of intrusion, helping to make networks more secure. The need for an intrusion detection system that detects and classifies attacks automatically and in a timely manner is growing, especially because of the spread of IoT and the nature of its data, which lead to an increase in attacks. Malicious attacks change continuously, giving rise to new attacks. In this paper we present a survey of anomaly detection, and hence of intrusion detection, which distinguishes normal behavior from malicious behavior while analyzing network traffic in order to discover new attacks. The paper surveys previous research and evaluates its performance on two recent datasets of real traffic: CSE-CIC-IDS2018 and Bot-IoT. To evaluate performance, we report the accuracy of intrusion detection in different systems.
Signature-based Intrusion Detection Systems (SIDS) play a crucial role within the arsenal of security components of most organizations. They can find traces of known attacks in the network traffic or host events for which patterns or signatures have been pre-established. SIDS include standard packages of detection rulesets, but only those rules suited to the operational environment should be activated for optimal performance. However, some organizations might skip this tuning process and instead activate default off-the-shelf rulesets without understanding their implications and trade-offs. In this work, we help gain insight into the consequences of using predefined rulesets on the performance of SIDS. We experimentally explore the performance of three SIDS in the context of web attacks. In particular, we gauge the detection rate obtained with predefined subsets of rules for Snort, ModSecurity and Nemesida using seven attack datasets. We also determine the precision and the alert rate of each detector in a real-life case using a large trace from a public webserver. Results show that the maximum detection rate achieved by the SIDS under test is insufficient to protect systems effectively and is lower than expected for known attacks. Our results also indicate that the choice of predefined settings activated on each detector strongly influences its detection capability and false alarm rate. Snort and ModSecurity scored either a very poor detection rate (activating the less-sensitive predefined ruleset) or a very poor precision (activating the full ruleset). We also found that combining several SIDS in a cooperative decision can improve the precision or the detection rate, but not both. Consequently, it is necessary to reflect upon the role of these open-source SIDS with default configurations as core elements for protection in the context of web attacks.
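The trade-off in cooperative decisions can be illustrated with a minimal sketch. The abstract does not say how the detectors' outputs are combined, so the OR and AND rules below are assumptions, and the two detectors and their alerts are hypothetical toy data rather than results from the study.

```python
from typing import Sequence


def detection_rate(alerts: Sequence[bool], is_attack: Sequence[bool]) -> float:
    """Fraction of real attacks that raised an alert (recall)."""
    attacks = sum(is_attack)
    hits = sum(a and t for a, t in zip(alerts, is_attack))
    return hits / attacks if attacks else 0.0


def precision(alerts: Sequence[bool], is_attack: Sequence[bool]) -> float:
    """Fraction of raised alerts that correspond to real attacks."""
    raised = sum(alerts)
    hits = sum(a and t for a, t in zip(alerts, is_attack))
    return hits / raised if raised else 0.0


# Hypothetical toy data: six requests, the first three are attacks.
is_attack = [True, True, True, False, False, False]
detector_a = [True, True, False, True, False, False]
detector_b = [True, False, True, False, True, False]

# Assumed combination rules: alert if EITHER detector fires, or only if BOTH do.
either = [a or b for a, b in zip(detector_a, detector_b)]
both = [a and b for a, b in zip(detector_a, detector_b)]

for name, alerts in [("A", detector_a), ("B", detector_b),
                     ("A or B", either), ("A and B", both)]:
    print(f"{name:8s} detection rate={detection_rate(alerts, is_attack):.2f} "
          f"precision={precision(alerts, is_attack):.2f}")
```

In this toy case the OR rule catches every attack but lets through the false alarms of both detectors, while the AND rule removes the false alarms but misses attacks that only one detector sees.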
Finally, we provide an efficient method for systematically determining which rules to deactivate in a ruleset in order to significantly reduce the false alarm rate for a target operational environment. We tested our approach using Snort’s ruleset on our real-life trace, increasing the precision from 0.015 to 1 in less than 16 h of work.
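The abstract does not describe the method itself. As a rough illustration of why deactivating rules can raise precision, the sketch below ranks rules by the false alarms they raise on a labeled trace and disables those that never produce a true detection. The function names, the rule identifiers and the alert data are all hypothetical, and this is not the authors' procedure.

```python
from collections import Counter


def rules_to_deactivate(alerts, max_false_alarms=0):
    """Return rules that only raise false alarms, noisiest first.

    `alerts` is a list of (rule_id, is_true_positive) pairs taken from a
    labeled trace (a hypothetical input format).
    """
    false_alarms = Counter()
    true_hits = Counter()
    for rule_id, is_tp in alerts:
        if is_tp:
            true_hits[rule_id] += 1
        else:
            false_alarms[rule_id] += 1
    # most_common() sorts by count, so the biggest offenders come first.
    return [rule for rule, n in false_alarms.most_common()
            if n > max_false_alarms and true_hits[rule] == 0]


def precision(alerts, disabled=frozenset()):
    """Precision of the alerts that remain once `disabled` rules are off."""
    kept = [is_tp for rule_id, is_tp in alerts if rule_id not in disabled]
    return sum(kept) / len(kept) if kept else 0.0


# Hypothetical trace: R1 and R2 only raise false alarms, R3 flags a real attack.
alerts = [("R1", False)] * 6 + [("R2", False)] * 3 + [("R3", True)]

disabled = rules_to_deactivate(alerts)
print("deactivate:", disabled)
print("precision before:", precision(alerts))
print("precision after: ", precision(alerts, set(disabled)))
```

A real deployment would also need to check that a deactivated rule does not cover attacks absent from the trace, which is why the result depends on the target operational environment.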
The complexity of network intrusion detection systems (IDSs) is increasing due to the continuous growth in network traffic, the variety of attacks and the ever-changing network environment. In addition, network traffic is asymmetric, containing little attack data, and the attack data are so complex that they are difficult to detect. Many studies have sought to improve intrusion detection performance through feature engineering. These approaches work well in the dataset environment; however, they struggle to cope with a changing network environment. This paper proposes an intrusion detection hyperparameter control system (IDHCS) that controls and trains a deep neural network (DNN) feature extractor and a k-means clustering module using a reinforcement learning model based on proximal policy optimization (PPO). The IDHCS controls the DNN feature extractor to extract the most valuable features in the network environment and identifies intrusions through k-means clustering. Through iterative learning with the PPO-based reinforcement learning model, the system is optimized to improve performance automatically according to the network environment in which the IDHCS is used. Experiments were conducted to evaluate the system performance using the CICIDS2017 and UNSW-NB15 datasets. The system achieved an F1-score of 0.96552 on CICIDS2017 and 0.94268 on UNSW-NB15. A further experiment merged the two datasets to build a more extensive and complex test environment. By merging the datasets, the attack types in the experiment became more diverse and their patterns became more complex. An F1-score of 0.93567 was achieved on the merged dataset, that is, 97% to 99% of the performance obtained on CICIDS2017 and UNSW-NB15 individually. The results show that the proposed IDHCS improves IDS performance by automatically learning new types of attacks: through continuous learning, it manages the intrusion detection features regardless of changes in the network environment.
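The relative performance quoted above can be checked with simple arithmetic. The sketch below assumes that the "97% to 99%" figure is the ratio of the merged-dataset F1-score to each single-dataset F1-score, which the abstract implies but does not state explicitly; the variable names are illustrative.

```python
# F1-scores as reported in the abstract.
f1_cicids2017 = 0.96552
f1_unsw_nb15 = 0.94268
f1_merged = 0.93567

# Assumed interpretation: merged-dataset F1 relative to each single-dataset F1.
ratio_cicids = f1_merged / f1_cicids2017
ratio_unsw = f1_merged / f1_unsw_nb15

print(f"merged vs CICIDS2017: {ratio_cicids:.1%}")
print(f"merged vs UNSW-NB15:  {ratio_unsw:.1%}")
```

Under this reading the ratios come to roughly 96.9% and 99.3%, which round to the 97% and 99% quoted.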