UGRansome1819: A Novel Dataset for Anomaly Detection and Zero-Day Threats

Information ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 405
Author(s):  
Mike Nkongolo ◽  
Jacobus Philippus van Deventer ◽  
Sydney Mambwe Kasongo

This research introduces the production methodology of an anomaly detection dataset using ten desirable requirements. Subsequently, the article presents the produced dataset, named UGRansome, created with up-to-date and modern network traffic (netflow), which represents cyclostationary patterns of normal and abnormal classes of threatening behaviours. It was discovered that the timestamp of various network attacks is less than one minute, and this feature pattern was used to record the time taken by the threat to infiltrate a network node. The main asset of the proposed dataset is its implication in the detection of zero-day attacks and anomalies that have not been explored before and cannot be recognised by known threat signatures. For instance, the UDP Scan attack has been found to utilise the lowest netflow in the corpus, while Razy utilises the highest. In turn, the EDA2 and Globe malware are the most abnormal zero-day threats in the proposed dataset. These feature patterns are included in the corpus but derived from two well-known datasets, namely UGR’16 and ransomware, which include real-life instances. The former incorporates cyclostationary patterns while the latter includes ransomware features. The UGRansome dataset was tested with cross-validation and compared to the KDD99 and NSL-KDD datasets to assess the performance of Ensemble Learning algorithms. False alarms were minimized with a null empirical error during the experiment, which demonstrates that applying the Random Forest algorithm to UGRansome can produce accurate results that enhance zero-day threat detection. Additionally, most zero-day threats such as Razy, Globe, EDA2, and TowerWeb are recognised as advanced persistent threats that are cyclostationary in nature, and it is predicted that they will use spamming and phishing for intrusion. Lastly, achieving the UGRansome balance was found to be NP-Hard due to real-life threatening classes that do not have a uniform distribution in terms of the number of instances.

Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1747
Author(s):  
Niraj Thapa ◽  
Zhipeng Liu ◽  
Addison Shaver ◽  
Albert Esterline ◽  
Balakrishna Gokaraju ◽  
...  

Anomaly detection and multi-attack classification are major concerns for cyber defense. Several publicly available datasets have been used extensively for the evaluation of Intrusion Detection Systems (IDSs). However, most of the publicly available datasets may not contain attack scenarios based on evolving threats. The development of a robust network intrusion dataset is vital for network threat analysis and mitigation. Proactive IDSs are required to tackle ever-growing threats in cyberspace. Machine learning (ML) and deep learning (DL) models have been deployed recently to detect the various types of cyber-attacks. However, current IDSs struggle to attain both a high detection rate and a low false alarm rate. To address these issues, we first develop a Center for Cyber Defense (CCD)-IDSv1 labeled flow-based dataset in an OpenStack environment. Five different attacks with normal usage imitating real-life usage are implemented. The number of network features is increased to overcome the shortcomings of the previous network flow-based datasets such as CIDDS and CIC-IDS2017. Secondly, this paper presents a comparative analysis on the effectiveness of different ML and DL models on our CCD-IDSv1 dataset. In this study, we consider both cyber anomaly detection and multi-attack classification. To improve the performance, we developed two DL-based ensemble models: Ensemble-CNN-10 and Ensemble-CNN-LSTM. Ensemble-CNN-10 combines 10 CNN models developed from 10-fold cross-validation, whereas Ensemble-CNN-LSTM combines base CNN and LSTM models. This paper also presents feature importance for both anomaly detection and multi-attack classification. Overall, the proposed ensemble models performed well in both the 10-fold cross-validation and independent testing on our dataset. Together, these results suggest the robustness and effectiveness of the proposed IDSs based on ML and DL models on the CCD-IDSv1 intrusion detection dataset.


Author(s):  
Juma Ibrahim ◽  
Slavko Gajin

Entropy-based network traffic anomaly detection techniques are attractive due to their simplicity and applicability in a real-time network environment. Even though flow data provide only a basic set of information about network communications, they are suitable for efficient entropy-based anomaly detection techniques. However, a recent work reported a serious weakness of general entropy-based anomaly detection: it can be deceived by adding spoofed data that camouflage the anomaly. Moreover, techniques for further classification of the anomalies mostly rely on machine learning, which involves additional complexity. We address these issues by providing two novel approaches. Firstly, we propose an efficient protection mechanism against entropy deception, which is based on the analysis of changes in different entropy types, namely Shannon, Rényi, and Tsallis entropies, and on monitoring the number of distinct elements in a feature distribution as a new detection metric. The proposed approach makes the entropy techniques more reliable. Secondly, we have extended the existing entropy-based anomaly detection approach with an anomaly classification method. Based on a multivariate analysis of the entropy changes of multiple features as well as aggregation by complex feature combinations, entropy-based anomaly classification rules were proposed and successfully verified through experiments. Experimental results are provided to validate the feasibility of the proposed approach for practical implementation of an efficient anomaly detection and classification method in a general real-life network environment.
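The three entropy types named in this abstract, together with the distinct-element count, can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' implementation: the function names, the choice of natural logarithm, and the example orders `alpha=2` and `q=2` are all assumptions made here.

```python
import math
from collections import Counter


def distribution(values):
    """Turn raw feature values (e.g. source ports) into probabilities."""
    counts = Counter(values)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def shannon_entropy(p):
    """Shannon entropy: -sum(p_i * ln p_i)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def renyi_entropy(p, alpha=2.0):
    """Renyi entropy of order alpha (alpha != 1): ln(sum p_i^alpha) / (1 - alpha)."""
    if alpha == 1:
        raise ValueError("alpha must differ from 1")
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1.0 - alpha)


def tsallis_entropy(p, q=2.0):
    """Tsallis entropy of order q (q != 1): (1 - sum p_i^q) / (q - 1)."""
    if q == 1:
        raise ValueError("q must differ from 1")
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


def distinct_count(values):
    """Number of distinct elements in the feature distribution."""
    return len(set(values))


# Hypothetical window of destination ports taken from flow records.
window = [80, 443, 22, 53]
p = distribution(window)
metrics = {
    "shannon": shannon_entropy(p),
    "renyi": renyi_entropy(p),
    "tsallis": tsallis_entropy(p),
    "distinct": distinct_count(window),
}
```

Comparing how these values change from one time window to the next, rather than their absolute levels, is the kind of signal the abstract describes; the comparison logic itself is not specified in the abstract and is left out here.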


Author(s):  
Rupam Mukherjee

For prognostics in industrial applications, the degree of anomaly of a test point from a baseline cluster is estimated using a statistical distance metric. Among different statistical distance metrics, energy distance is an interesting concept based on Newton’s Law of Gravitation, promising simpler computation than classical distance metrics. In this paper, we review the state-of-the-art formulations of energy distance and point out several reasons why they are not directly applicable to the anomaly-detection problem. We therefore propose a new energy-based metric called the P-statistic, which addresses these issues, is applicable to anomaly detection, and retains the computational simplicity of the energy distance. We also demonstrate its effectiveness on a real-life dataset.
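For orientation, the classical two-sample energy distance that this abstract takes as its starting point can be sketched as below. This is the standard formulation, 2·E|X−Y| − E|X−X'| − E|Y−Y'|, restricted to one-dimensional samples for brevity. It is not the P-statistic proposed in the paper, whose definition the abstract does not give; the function names and sample values are assumptions.

```python
def mean_abs_diff(a, b):
    """Average absolute difference over all pairs (x in a, y in b)."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))


def energy_distance(a, b):
    """Classical energy distance statistic between two 1-D samples."""
    return 2.0 * mean_abs_diff(a, b) - mean_abs_diff(a, a) - mean_abs_diff(b, b)


# Hypothetical baseline cluster and two single test points.
baseline = [0.0, 1.0, 2.0]
near = energy_distance([1.0], baseline)
far = energy_distance([10.0], baseline)
```

With a single test point the within-sample term for that point is zero, so the score reduces to a comparison between the point's average distance to the baseline and the baseline's own spread. A point far from the cluster scores higher than one inside it.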


2019 ◽  
Author(s):  
Lisa Kroll ◽  
Nikolaus Böhning ◽  
Heidi Müßigbrodt ◽  
Maria Stahl ◽  
Pavel Halkin ◽  
...  

BACKGROUND Agitation is common in geriatric patients with dementia (PWD) admitted to an emergency department (ED) and is associated with a higher risk of an unfavourable clinical course. Hence, monitoring of vital signs and enhanced movement is essential in these patients during their stay in the ED. Since PWD rarely tolerate fixed monitoring devices, non-contact monitoring systems might represent appropriate alternatives. OBJECTIVE To study the reliability of a non-contact monitoring system (NCMSys) and of a tent-like device (“Charité Dome”, ChD) intended to shelter PWD from the busy ED environment. Further, effects of the ChD on wellbeing and agitation of PWD will be measured. METHODS Both devices were attached to the patient’s bed. Tests on technical reliability and other safety issues of the NCMSys and the ChD were performed at the iDoc-institute. A feasibility study evaluating the reliability of the NCMSys with and without the ChD was performed in the real-life setting of an ED and on a geriatric-gerontopsychiatric ward. Technical reliability and other safety issues were tested with six healthy volunteers. For the feasibility study, 19 patients were included (ten males and nine females; mean age: 77.4 (55-93) years), of which 14 were PWD. PWD inclusion criteria were age ≥55 years, a dementia diagnosis, and written consent (by patients themselves or by a custodian). Exclusion criteria were acute life-threatening situations and missing consent. RESULTS Heart rate, changes in movement, and sound emissions were measured reliably by the NCMSys, whereas patient movements affected respiratory rate measurements. The ChD did not impact patients’ vital signs or movements in our study setting. However, 53% of the PWD (7/13) and most of the patients without dementia (4/5) benefited from its use regarding their agitation and overall wellbeing. CONCLUSIONS NCMSys and ChD work reliably in the clinical setting and have positive effects on agitation and wellbeing. The results of this feasibility study encourage prospective studies with longer durations to further evaluate this concept for monitoring and prevention of agitation in PWD in the ED. CLINICALTRIAL ICTRP: “Charité-Dome-Study - DRKS00014737”


2021 ◽  
Vol 6 (2) ◽  
pp. 295-302
Author(s):  
Adelaiye Oluwasegun Ishaya ◽  
Ajibola Aminat ◽  
Bisallah Hashim ◽  
Abiona Akeem Adekunle

Author(s):  
Hesham M. Al-Ammal

Detection of anomalies in a given data set is a vital step in several applications in cybersecurity, including intrusion detection, fraud, and social network analysis. Many of these techniques detect anomalies by examining graph-based data. Analyzing graphs makes it possible to capture relationships, communities, and anomalies. The advantage of using graphs is that many real-life situations can be easily modeled by a graph that captures their structure and inter-dependencies. Although anomaly detection in graphs dates back to the 1990s, recent advances in research have utilized machine learning methods for anomaly detection over graphs. This chapter concentrates on static graphs (both labeled and unlabeled) and summarizes some of these recent studies in machine learning for anomaly detection in graphs. This includes methods such as support vector machines, neural networks, generative neural networks, and deep learning methods. The chapter reflects on the successes and challenges of using these methods in the context of graph-based anomaly detection.


2019 ◽  
pp. 249-264
Author(s):  
Ruth MacDonald

Presented as a variation of Homer’s Odyssey rewritten from the perspective of a cancer patient’s spouse, Gwyneth Lewis’s A Hospital Odyssey (2010) has its roots in her husband’s real-life diagnosis with non-Hodgkin lymphoma. In thus reconfiguring the hero’s homecoming as a wife’s quest to find and cure her sick husband, the poet is able to chart her journey in coming to terms with her husband’s condition and treatment. Although the voyage towards healing and marital reconciliation is at times difficult and fraught with danger, the poem’s protagonist is buoyed along her way through a series of affirming encounters with other characters. Reading the text alongside Julia Kristeva’s concept of the abject, the chapter considers the ways in which Lewis interrogates contemporary attitudes towards the sick, as well as what it means to care for someone diagnosed with a life-threatening illness.


Author(s):  
Jörg Piper ◽  
Birgit Müller

Technical concepts of a multi-parameter-based system are described which can be used for continuous ambulatory monitoring of several vital signs. When critical or fatal events are detected, an automatic alarm is generated, including information about the patient's position (global positioning system, GPS) and additional messages. Many vital parameters are continuously monitored by “bio detectors”, which are connected to a mobile data acquisition system carried by the patient. This data acquisition system interacts with a mobile phone so that an alarm can be sounded immediately in cases of critical or fatal events. Other episodes that are relevant for the patient's long-term prognosis but do not lead to life-threatening outcomes can be stored for elective analyses without generating an alarm. Moreover, patients can manually give an alarm on demand. Potential false alarms can be manually canceled. In further stages of development, these technical components could interact with electronic control systems of cars so that cars could be immediately stopped if the driver becomes unconscious.


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Mingzhu Tang ◽  
Xiangwan Fu ◽  
Huawei Wu ◽  
Qi Huang ◽  
Qi Zhao

Traffic flow anomaly detection helps to improve the efficiency and reliability of detecting fault behavior and the overall effectiveness of traffic operation. The data detected by traffic flow sensors contain a lot of noise due to equipment failure, environmental interference, and other factors. For traffic flow data with heavy noise, a traffic flow anomaly detection method based on robust ridge regression with the particle swarm optimization (PSO) algorithm is proposed. Feature sets containing historical characteristics with a strong linear correlation and statistical characteristics using the optimal sliding window are constructed. The feature sets are then provided as inputs to the PSO-Huber-Ridge model, and the model outputs the traffic flow. The Huber loss function is recommended to reduce noise interference in the traffic flow. The L2 regular term of the ridge regression is employed to reduce the degree of overfitting in model training. A fitness function is constructed that balances the relative size of the k-fold cross-validation root mean square error and the k-fold cross-validation average absolute error through the control parameter η, in order to improve the efficiency of the optimization algorithm and the generalization ability of the proposed model. The hyperparameters of the robust ridge regression forecast model are optimized by the PSO algorithm to obtain the optimal hyperparameters. The traffic flow data set is used to train and validate the proposed model. Compared with other optimization methods, the proposed model has the lowest RMSE, MAE, and MAPE. Finally, the traffic flow forecast by the proposed model is used to perform anomaly detection. The abnormality of the error between the forecast value and the actual value is detected by the abnormal traffic flow threshold based on the sliding window. The experimental results verify the validity of the proposed anomaly detection model.
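The building blocks named in this abstract can be sketched as follows: the Huber loss, an L2 (ridge) penalty, and a fitness value combining cross-validated RMSE and MAE through η. The Huber loss and ridge penalty follow their standard definitions. The fitness function is an assumption: the abstract does not state its form, so a simple convex combination is used here purely for illustration, and all function names and default parameter values (`delta`, `lam`, `eta`) are hypothetical.

```python
def huber(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def huber_ridge_objective(weights, bias, rows, targets, delta=1.0, lam=0.1):
    """Mean Huber loss of a linear model plus an L2 penalty on the weights."""
    data_term = 0.0
    for x, y in zip(rows, targets):
        prediction = bias + sum(w * xi for w, xi in zip(weights, x))
        data_term += huber(y - prediction, delta)
    penalty = lam * sum(w * w for w in weights)
    return data_term / len(rows) + penalty


def fitness(cv_rmse, cv_mae, eta=0.5):
    """ASSUMED form: convex combination of cross-validated RMSE and MAE."""
    return eta * cv_rmse + (1.0 - eta) * cv_mae


# Hypothetical toy data: one feature, targets equal to the feature value.
rows = [[1.0], [2.0]]
targets = [1.0, 2.0]
objective_value = huber_ridge_objective([1.0], 0.0, rows, targets)
```

In a PSO-style search, each particle would encode candidate hyperparameters such as `delta` and `lam`, and a value like `fitness(...)` would score them; the search loop itself is omitted here.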

