Machine Learning Based Network Anomaly Detection

Network Anomaly Detection Systems (NADSs) play a prominent role in network security. Because malware in network traffic changes dynamically, traditional tools and techniques are failing to protect networks from attack penetration. In this paper we propose a two-phase model to detect and categorize anomalies. First, we selected Random Forest (RF) because it achieved the highest accuracy score among eleven commonly used algorithms tested on the same data set. The RF is used to detect anomalies and to generate an extra feature named “attack-or-not”. Second, we fed a Neural Network with the data, including the “attack-or-not” feature, to differentiate attack categories, which helps in treating each type accordingly. The model scored 0.99 for both Precision and Recall in the anomaly detection phase, and 0.93 for Precision and 0.88 for Recall in the attack categorization phase. We used the UNSW-NB15 data set in our study.
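The two-phase idea described above (a detector whose output becomes an extra input feature for a categorizer) can be illustrated with a minimal sketch. This is not the authors' implementation: a nearest-centroid classifier stands in for both the Random Forest and the Neural Network so that the example needs only the standard library, and the feature names, toy rows, and helper functions are all hypothetical.

```python
from collections import defaultdict
from math import dist

def fit_centroids(rows, labels):
    """Return {label: centroid} for a nearest-centroid classifier (stand-in model)."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }

def predict(centroids, row):
    """Predict the label whose centroid is closest to the row."""
    return min(centroids, key=lambda label: dist(centroids[label], row))

# Toy flows: [duration, bytes]. Hypothetical features, not UNSW-NB15 columns.
X = [[1.0, 1.0], [1.2, 0.9], [8.0, 8.5], [8.2, 8.1], [8.1, 1.0], [7.9, 1.2]]
category = ["normal", "normal", "dos", "dos", "probe", "probe"]
is_attack = [0 if c == "normal" else 1 for c in category]

# Phase 1: a binary detector produces the "attack-or-not" feature.
detector = fit_centroids(X, is_attack)
X_aug = [row + [float(predict(detector, row))] for row in X]

# Phase 2: a categorizer is trained on rows that include the extra feature.
categorizer = fit_centroids(X_aug, category)

def classify(row):
    """Run both phases on a new flow and return its predicted category."""
    flag = float(predict(detector, row))
    return predict(categorizer, row + [flag])

print(classify([8.0, 8.0]), classify([1.1, 1.0]))
```

In a real setting the phase-one feature would normally be generated from out-of-fold predictions, so that the categorizer does not train on labels the detector has already seen.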

Author(s):  
Ramesh Paudel ◽  
Lauren Tharp ◽  
Dulce Kaiser ◽  
William Eberle ◽  
Gerald Gannod

Network protocol analyzers such as Wireshark are valuable for analyzing network traffic, but the volume of data that must be analyzed makes it difficult to determine which behaviors are out of the ordinary. Network anomaly detection systems can provide vital insights to security analysts to supplement protocol analyzers, but this feedback can be difficult to interpret because of the complexity of the algorithms used and the lack of context explaining why an event was labeled as anomalous. We present an approach for visualizing anomalies using a graph-based anomaly detection methodology that aims to provide visual context to network traffic. We demonstrate the approach on network traffic flows as a means of aiding the investigation and triage of anomalous network events. The simplicity of a visual representation supports fast analysis of anomalous traffic to distinguish true positives from false positives and prevent further potential damage.


Author(s):  
Yirui Hu

This chapter is an introduction to multi-cluster based anomaly detection analysis. Various anomalies present different behaviors in wireless networks, and not all anomalies are known to networks. Unsupervised algorithms are desirable to automatically characterize the nature of traffic behavior and distinguish anomalies from normal behaviors. Essentially all anomaly detection systems first learn a model of the normal patterns in a training data set, and then determine the anomaly score of a given testing data point based on its deviation from the learned patterns. The initial step of learning a good model is the most crucial part of anomaly detection. Multi-cluster based analyses are valuable because they can yield insights into human behaviors and learn similar patterns in temporal traffic data. The anomaly threshold can be determined by quantitative analysis based on the trained model. A novel quantitative “Donut” algorithm for anomaly detection on the basis of model log-likelihood is proposed in this chapter.
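The general pattern described above (learn a model of normal traffic, score new points by their log-likelihood, and derive a threshold from the trained model) can be sketched as follows. This is not the “Donut” algorithm, whose details are not given here. The sketch assumes the clusters have already been assigned, fits one 1-D Gaussian per cluster, and uses the lowest training log-likelihood as the threshold; the data values and function names are hypothetical.

```python
import math
import statistics

def fit_mixture(clusters):
    """Fit one Gaussian per pre-assigned cluster; weight = share of training points."""
    total = sum(len(c) for c in clusters)
    return [(len(c) / total, statistics.fmean(c), statistics.pstdev(c)) for c in clusters]

def log_likelihood(x, mixture):
    """Log of the weighted sum of Gaussian densities at x."""
    density = sum(
        w * math.exp(-((x - mu) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))
        for w, mu, sd in mixture
    )
    return math.log(density) if density > 0 else float("-inf")

# Hypothetical "normal" traffic volumes forming two behavior clusters.
night = [10, 12, 11, 9, 8]
day = [98, 102, 101, 99, 100]
mixture = fit_mixture([night, day])

# Threshold: the lowest log-likelihood observed on normal training data.
threshold = min(log_likelihood(x, mixture) for x in night + day)

def is_anomaly(x):
    """Flag points that are less likely than anything seen in training."""
    return log_likelihood(x, mixture) < threshold

print(is_anomaly(100), is_anomaly(55))
```

Using the training minimum as the threshold is only one simple choice; a low percentile of the training scores is a common alternative when the training data may itself contain a few outliers.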


2009 ◽  
Vol 7 (1) ◽  
pp. 63-81 ◽  
Author(s):  
Ayesha Binte Ashfaq ◽  
Muhammad Qasim Ali ◽  
Syed Ali Khayam

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Nouar AlDahoul ◽  
Hezerul Abdul Karim ◽  
Abdulaziz Saleh Ba Wazir

Network anomaly detection is still an open and challenging task that aims to detect anomalous network traffic for security purposes. Network traffic data are usually large-scale and imbalanced, and they often have noisy labels. This paper addresses these challenges using ZYELL’s million-scale, highly imbalanced dataset. We propose to train deep neural networks with class weight optimization to learn complex patterns from the rare anomalies observed in the traffic data. This paper proposes a novel model fusion that combines two deep neural networks: a binary normal/attack classifier and a multi-attack classifier. The proposed solution can detect various network attacks such as Distributed Denial of Service (DDOS), IP probing, PORT probing, and Network Mapper (NMAP) probing. Experiments conducted on ZYELL’s real-world dataset show promising performance: the proposed approach outperformed the baseline model in terms of average macro Fβ score and false alarm rate by 17% and 5.3%, respectively.
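Two quantities mentioned above, class weights for imbalanced data and the macro Fβ score, can be made concrete with a short sketch. The weighting shown is the common inverse-frequency ("balanced") heuristic, which is only a plausible starting point and not necessarily the weight optimization used in the paper. The value of β is not stated here, so `beta` is left as a parameter, and the toy labels are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}

def macro_fbeta(y_true, y_pred, beta=1.0):
    """Unweighted mean of per-class F-beta scores."""
    classes = sorted(set(y_true) | set(y_pred))
    b2 = beta ** 2
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = (1 + b2) * tp + b2 * fn + fp
        scores.append((1 + b2) * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical imbalanced labels: 8 normal flows, 2 attack flows.
y_true = ["normal"] * 8 + ["ddos"] * 2
y_pred = ["normal"] * 8 + ["ddos", "normal"]

weights = balanced_class_weights(y_true)
score = macro_fbeta(y_true, y_pred, beta=1.0)
print(weights, round(score, 4))
```

Because the macro average gives every class the same weight, a missed rare attack lowers the score far more than it would lower plain accuracy, which is why it suits imbalanced traffic data.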


2021 ◽  
Vol 12 (3) ◽  
pp. 1-28
Author(s):  
Makiya Nakashima ◽  
Alex Sim ◽  
Youngsoo Kim ◽  
Jonghyun Kim ◽  
Jinoh Kim

Variable selection (also known as feature selection) is essential for reducing learning complexity by prioritizing features, particularly for a massive, high-dimensional dataset such as network traffic data. In practice, however, performing feature selection effectively is not easy, despite the availability of existing selection techniques. In our initial experiments, we observed that existing selection techniques produce different sets of features even under the same condition (e.g., a fixed size for the resulting set). In addition, individual selection techniques perform inconsistently, sometimes better and sometimes worse than others, so relying on any single one of them to build models from the selected features would be risky. More critically, automating the selection process is difficult, since it otherwise requires laborious effort and intensive analysis by a group of experts. In this article, we explore challenges in automated feature selection as applied to network anomaly detection. We first present our ensemble approach, which benefits from existing feature selection techniques by incorporating them; one of the proposed ensemble techniques, based on greedy search, works highly consistently and shows results comparable to the existing techniques. We also address the problem of when to stop the feature elimination process, and present a set of methods designed to determine the number of features in the reduced feature set. Our experimental results on two recent network datasets show that the feature sets identified by the presented ensemble and stopping methods consistently yield performance comparable to conventional selection techniques with a smaller number of features.
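One simple way to combine the disagreeing outputs of several feature selectors is rank aggregation. The sketch below uses a Borda count, which is swapped in here purely for illustration: it is not the greedy-search ensemble proposed in the article, and the feature names and rankings are hypothetical.

```python
def borda_aggregate(rankings):
    """Combine several best-first feature rankings into one consensus ranking."""
    features = sorted(set().union(*rankings))
    n = len(features)
    points = {f: 0 for f in features}
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            # The top-ranked feature earns n points, the next n - 1, and so on.
            points[feature] += n - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(features, key=lambda f: (-points[f], f))

# Hypothetical best-first rankings from three different selection techniques.
rankings = [
    ["bytes", "duration", "port", "flags"],
    ["duration", "bytes", "flags", "port"],
    ["bytes", "port", "duration", "flags"],
]

consensus = borda_aggregate(rankings)
top_k = consensus[:2]  # k is fixed by hand here; choosing it is the stopping problem
print(consensus, top_k)
```

The hand-picked `k` is the part that the article's stopping methods aim to automate, for example by halting elimination once validation performance starts to degrade.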

