UNSW-NB15 Dataset and Machine Learning Based Intrusion Detection Systems

Network attacks have become one of the most important security problems in today's world. There is a sharp increase in the use of computers, mobiles, sensors, IoT devices, Big Data, Web applications/servers, Clouds and other computing resources in networks. With this growth in network traffic, hackers and malicious users are devising new kinds of network intrusions. Many techniques based on data mining and machine learning methods have been developed to detect these intrusions. Machine learning algorithms aim to detect anomalies using supervised and unsupervised approaches. Both detection techniques have been implemented using IDS datasets such as DARPA98, KDDCUP99, NSL-KDD, ISCX and ISOT. UNSW-NB15 is the latest dataset; it contains nine different modern attack types and a wide variety of real normal activities. In this paper, a detailed survey of various machine learning based techniques applied to the UNSW-NB15 dataset is carried out, concluding that UNSW-NB15 is more complex than the other datasets and can be regarded as a new benchmark dataset for evaluating NIDSs.

Author(s):  
Avinash R. Sonule

Abstract: Cyber-attacks have become one of the most important security problems in today's world. With the increase in the use of computing resources connected to the Internet, such as computers, mobiles, sensors, IoT devices, Big Data, Web applications/servers and Clouds, hackers and malicious users are devising new kinds of network intrusions. Many techniques based on data mining and machine learning methods have been developed to detect these intrusions, and they have been applied to various IDS datasets. UNSW-NB15 is the latest dataset; it contains different modern attack types and a wide variety of real normal activities. In this paper, we compare the Naïve Bayes algorithm with a proposed probability-based supervised machine learning algorithm on a reduced UNSW-NB15 dataset. Keywords: UNSW-NB15, Machine Learning, Naïve Bayes, All to Single (AS) features probability Algorithm
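A Naïve Bayes baseline of the kind compared in this paper can be sketched as follows. The feature matrix here is synthetic (random Gaussian flows standing in for a reduced UNSW-NB15 feature set), not the authors' actual data:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for a reduced UNSW-NB15 feature matrix:
# rows are flow records, columns are numeric flow features,
# labels are 0 (normal) / 1 (attack).
rng = np.random.default_rng(42)
n = 1000
X_normal = rng.normal(loc=0.0, scale=1.0, size=(n // 2, 5))
X_attack = rng.normal(loc=2.0, scale=1.0, size=(n // 2, 5))
X = np.vstack([X_normal, X_attack])
y = np.array([0] * (n // 2) + [1] * (n // 2))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Gaussian Naïve Bayes: assumes conditional independence of features.
model = GaussianNB()
model.fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print(f"Naive Bayes accuracy: {acc:.3f}")
```

Any probability-based alternative, such as the proposed AS features algorithm, would be benchmarked against this same split.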


Author(s):  
José María Jorquera Valero ◽  
Manuel Gil Pérez ◽  
Alberto Huertas Celdrán ◽  
Gregorio Martínez Pérez

As the number and sophistication of cyber threats increase year after year, security systems such as antivirus, firewalls, or Intrusion Detection Systems based on misuse detection techniques are improving their detection capabilities. However, these traditional systems are usually limited to detecting potential threats, since they are inadequate for spotting zero-day attacks or mutations in behaviour. The authors propose using honeypot systems as a further security layer able to provide a holistic level of intelligence in detecting unknown threats, or well-known attacks with new behaviour patterns. Since brute-force attacks have been increasing in recent years, the authors opted for an SSH medium-interaction honeypot to acquire a log set from attackers' interactions. The proposed system is able to acquire behaviour patterns of each attacker and link them with future sessions for early detection. The authors also generate a feature set to feed machine learning algorithms, with the main goal of identifying and classifying attackers' sessions, and thus being able to learn the malicious intentions behind executed cyber threats.
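Turning honeypot session logs into a feature set for such classifiers might look like the sketch below. The specific features (command count, distinct commands, download attempts, duration) are hypothetical illustrations, not the paper's actual feature set:

```python
from collections import Counter

def session_features(commands, duration_s):
    """Turn one SSH honeypot session into a numeric feature vector.
    Hypothetical features: total commands, distinct commands,
    download attempts, and session duration in seconds."""
    counts = Counter(commands)
    download_cmds = sum(counts[c] for c in ("wget", "curl", "scp"))
    return [len(commands), len(counts), download_cmds, duration_s]

# Example session: an attacker probing the host, then fetching a payload.
session = ["uname", "whoami", "wget", "chmod", "wget"]
vec = session_features(session, duration_s=42.0)
print(vec)  # [5, 4, 2, 42.0]
```

Vectors like these can then be linked across sessions to re-identify a returning attacker early.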


2019 ◽  
Author(s):  
Farhaan Noor Hamdani ◽  
Farheen Siddiqui

With the advent of the internet, there is major concern regarding the growing number of attacks, where an attacker can target any computing or network resource remotely. Also, the exponential shift towards the use of smart end-technology devices results in various security-related concerns, which include the detection of anomalous data traffic on the internet. Distinguishing legitimate traffic from malignant traffic is a complex task in itself. Many attacks affect system resources, thereby degrading their computing performance. In this paper we propose a framework of supervised models implemented using machine learning algorithms which can enhance or aid existing intrusion detection systems in detecting a variety of attacks. Here the KDD (Knowledge Discovery and Data mining) dataset is used as a benchmark. In accordance with their detection abilities, we also analyse their performance, accuracy and alert logs, and compute their overall detection rate. These machine learning algorithms are validated and tested in terms of accuracy, precision, and true/false positives and negatives. Experimental results show that these methods are effective, generate low false positives, and can be operative in building a defence line against network intrusions. Further, we compare these algorithms in terms of various functional parameters.
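The evaluation measures named in the abstract (accuracy, precision, detection rate, false positives) all derive from confusion-matrix counts; a minimal sketch with illustrative counts:

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard IDS evaluation measures from confusion-matrix counts:
    tp/fp = true/false positives, tn/fn = true/false negatives."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    detection_rate = tp / (tp + fn)       # recall on the attack class
    false_positive_rate = fp / (fp + tn)
    return accuracy, precision, detection_rate, false_positive_rate

# Illustrative counts, not results from the paper.
acc, prec, dr, fpr = detection_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"accuracy={acc:.3f} precision={prec:.3f} DR={dr:.3f} FPR={fpr:.3f}")
```

A low FPR alongside a high detection rate is what "low false positives" in the abstract corresponds to numerically.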


Author(s):  
Giovanni Apruzzese ◽  
Mauro Andreolini ◽  
Luca Ferretti ◽  
Mirco Marchetti ◽  
Michele Colajanni

The incremental diffusion of machine learning algorithms in support of cybersecurity is creating novel defensive opportunities but also new types of risks. Multiple studies have shown that machine learning methods are vulnerable to adversarial attacks that create tiny perturbations aimed at decreasing the effectiveness of threat detection. We observe that the existing literature assumes threat models that are inappropriate for realistic cybersecurity scenarios, because they consider opponents with complete knowledge of the cyber detector or that can freely interact with the target systems. By focusing on Network Intrusion Detection Systems based on machine learning methods, we identify and model the real capabilities and circumstances that are necessary for an attacker to carry out a feasible and successful adversarial attack. We then apply our model to several adversarial attacks proposed in the literature and highlight the limits and merits that can result in actual adversarial attacks. The contributions of this paper can help harden defensive systems by letting cyber defenders address the most critical and real issues, and can benefit researchers by allowing them to devise novel forms of adversarial attacks based on realistic threat models.
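The constraint the paper emphasises, that a realistic attacker cannot perturb every feature, can be sketched with a toy linear detector. The mask below models limited control over traffic attributes; this is an illustration, not any of the detectors or attacks analysed in the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy linear "detector" standing in for an ML-based NIDS.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 4)), rng.normal(3, 1, (200, 4))])
y = np.array([0] * 200 + [1] * 200)
clf = LogisticRegression().fit(X, y)

# A constrained evasion attempt: shift the sample against the model's
# weights (FGSM-style direction), but only on the first two features,
# since the attacker cannot modify the others without breaking the flow.
x = X[300].copy()                        # an attack sample
mask = np.array([1.0, 1.0, 0.0, 0.0])    # modifiable features only
x_adv = x - 2.0 * np.sign(clf.coef_[0]) * mask

print("flagged before:", clf.predict(x.reshape(1, -1))[0],
      "flagged after:", clf.predict(x_adv.reshape(1, -1))[0])
```

Whether the constrained perturbation actually evades the detector depends on how much of the decision weight sits on the untouchable features, which is exactly the feasibility question the paper raises.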


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Quang-Vinh Dang

Purpose This study aims to explain the state-of-the-art machine learning models used in the intrusion detection problem in a way that is understandable to humans, and to study the relationship between the explainability and the performance of the models.

Design/methodology/approach The authors study a recent intrusion dataset collected from real-world scenarios and use state-of-the-art machine learning algorithms to detect the intrusions. They apply several novel techniques to explain the models, then evaluate the explanations manually. They then compare the performance of the models before and after explainability-based feature selection.

Findings The authors confirm their hypothesis and claim that by enforcing explainability, the model becomes more robust, requires less computational power and achieves better predictive performance.

Originality/value The authors draw their conclusions based on their own research and experimental work.
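Explainability-based feature selection of the kind compared here can be sketched by ranking features with a model's own importances and retraining on the top ones. The data is synthetic and the importance ranking is a simple stand-in for the explanation techniques the study evaluates:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic intrusion-like data: 20 features, only 5 informative.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           n_redundant=2, random_state=7)

full = RandomForestClassifier(n_estimators=100, random_state=7).fit(X, y)

# Keep the features the model itself ranks as most important.
top = np.argsort(full.feature_importances_)[::-1][:5]

score_full = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=7), X, y, cv=5).mean()
score_top = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=7), X[:, top], y, cv=5).mean()
print(f"all 20 features: {score_full:.3f}  top 5 features: {score_top:.3f}")
```

The reduced model trains on a quarter of the features, which is the "less computational power" side of the trade-off the study reports.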


2021 ◽  
pp. 1-15
Author(s):  
O. Basturk ◽  
C. Cetek

ABSTRACT In this study, prediction of aircraft Estimated Time of Arrival (ETA) using machine learning algorithms is proposed. Accurate prediction of ETA is important for the management of delay and air traffic flow, runway assignment, gate assignment, collaborative decision making (CDM), coordination of ground personnel and equipment, and optimisation of the arrival sequence. Machine learning is able to learn from experience and make predictions with weak assumptions or no assumptions at all. In the proposed approach, general flight information, trajectory data and weather data were obtained from different sources in various formats. Raw data were converted to tidy data and inserted into a relational database. To obtain the features for training the machine learning models, the data were explored, cleaned and transformed into convenient features. New features were also derived from the available data. Random forests and deep neural networks were used to train the machine learning models. Both models can predict the ETA with a mean absolute error (MAE) of less than 6 min after departure, and less than 3 min after terminal manoeuvring area (TMA) entrance. Additionally, a web application was developed to dynamically predict the ETA using the proposed models.
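A random-forest ETA regressor of this kind, evaluated by MAE, can be sketched as follows. The features (distance, ground speed, headwind, arrival-queue length) and the data-generating formula are invented for illustration, not the study's actual feature set:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Hypothetical flight features and a noisy synthetic ETA in minutes.
rng = np.random.default_rng(1)
n = 2000
dist = rng.uniform(50, 400, n)        # distance to destination (km)
speed = rng.uniform(350, 480, n)      # ground speed (kt)
wind = rng.normal(0, 20, n)           # headwind component (kt)
queue = rng.integers(0, 10, n)        # aircraft ahead in the arrival queue
eta = dist / (speed - wind) * 60 + 2.0 * queue + rng.normal(0, 1.5, n)

X = np.column_stack([dist, speed, wind, queue])
X_tr, X_te, y_tr, y_te = train_test_split(X, eta, test_size=0.25,
                                          random_state=1)
model = RandomForestRegressor(n_estimators=200, random_state=1).fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, model.predict(X_te))
print(f"MAE: {mae:.2f} min")
```

A web application, as in the study, would simply call `model.predict` on freshly observed features as the flight progresses.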


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 656
Author(s):  
Xavier Larriva-Novo ◽  
Víctor A. Villagrá ◽  
Mario Vega-Barbas ◽  
Diego Rivera ◽  
Mario Sanz Rodrigo

Security in IoT networks is currently mandatory, due to the high volume of data that has to be handled. These systems are vulnerable to several cybersecurity attacks, which are increasing in number and sophistication. For this reason, new intrusion detection techniques have to be developed, being as accurate as possible for these scenarios. Intrusion detection systems based on machine learning algorithms have already shown high performance in terms of accuracy. This research proposes the study and evaluation of several preprocessing techniques based on traffic categorization for a machine learning neural network algorithm. For its evaluation, this research uses two benchmark datasets, namely UGR16 and UNSW-NB15, and one of the most used datasets, KDD99. The preprocessing techniques were evaluated in accordance with scaling and normalization functions. All of these preprocessing models were applied to different sets of characteristics based on a categorization composed of four groups of features: basic connection features, content characteristics, statistical characteristics and, finally, a group composed of traffic-based features and connection direction-based traffic characteristics. The objective of this research is to evaluate this categorization by using various data preprocessing techniques to obtain the most accurate model. Our proposal shows that, by applying the categorization of network traffic and several preprocessing techniques, the accuracy can be enhanced by up to 45%. The preprocessing of a specific group of characteristics allows for greater accuracy, allowing the machine learning algorithm to correctly classify parameters related to possible attacks.
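The scaling and normalization functions compared in this research can be illustrated on a toy group of flow features with very different ranges; the feature values below are invented:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Toy flow features on very different scales:
# duration (s), bytes transferred, packet count.
X = np.array([[0.5,  12000.0,  10.0],
              [3.0,    250.0,   4.0],
              [0.1, 900000.0, 120.0]])

X_std = StandardScaler().fit_transform(X)  # zero mean, unit variance
X_mm = MinMaxScaler().fit_transform(X)     # rescaled to [0, 1]

print(X_std.mean(axis=0).round(6))
print(X_mm.min(axis=0), X_mm.max(axis=0))
```

In the paper's setup, each of the four feature groups would be preprocessed separately with one of these functions before being fed to the neural network.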


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Alan Brnabic ◽  
Lisa M. Hess

Abstract Background Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated into applications that inform patient-provider decision making. Methods This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods, and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. Diverse methods, statistical packages and approaches were used across the identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation, but only two conducted external validation. Most studies utilized one algorithm, and only eight applied multiple machine learning algorithms to the data. Seven items on the Luo checklist were not met by more than 50% of the published studies. Conclusions A wide variety of approaches, algorithms, statistical software and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. To ensure that decisions for patient care are made with the highest-quality evidence, multiple machine learning approaches should be used, the model selection strategy should be clearly defined, and both internal and external validation should be performed. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.
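The review's recommendation, combining multiple machine learning algorithms with internal validation, can be sketched with a soft-voting ensemble over synthetic data; the algorithms chosen here are illustrative, not those used by the reviewed studies:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=3)

# Ensemble of the two most common methods in the review (decision tree,
# random forest) plus a logistic regression, averaged by probability.
ensemble = VotingClassifier(
    estimators=[("tree", DecisionTreeClassifier(random_state=3)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=3)),
                ("lr", LogisticRegression(max_iter=1000))],
    voting="soft")

# Internal validation via cross-validation, as most reviewed studies did;
# external validation would additionally require an independent dataset.
score = cross_val_score(ensemble, X, y, cv=5).mean()
print(f"5-fold CV accuracy: {score:.3f}")
```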


2021 ◽  
Vol 22 (5) ◽  
pp. 2704
Author(s):  
Andi Nur Nilamyani ◽  
Firda Nurul Auliah ◽  
Mohammad Ali Moni ◽  
Watshara Shoombuatong ◽  
Md Mehedi Hasan ◽  
...  

Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identification of site-specific nitration modification on tyrosine is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to the progress of machine learning, computational prediction can play a vital role before biological experimentation. Herein, we developed a computational predictor, PredNTS, by integrating multiple sequence features including K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. The important features were selected by the recursive feature elimination approach using a random forest classifier. Finally, we linearly combined the probability scores generated by the random forest (RF) models trained on the different single encodings. The resultant PredNTS predictor achieved an area under the curve (AUC) of 0.910 using five-fold cross-validation. It outperformed the existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web application with the curated datasets of PredNTS is publicly available.
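The two distinctive steps here, recursive feature elimination with a random forest and a linear combination of encoding-specific RF probability scores, can be sketched as follows. The data is synthetic, and the split into two "encoding groups" of 20 features is a hypothetical stand-in for the K-mer/CKSAAP/AAindex/binary encodings:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in: 40 sequence-derived features, imagined as two
# hypothetical encoding groups of 20 features each.
X, y = make_classification(n_samples=800, n_features=40, n_informative=8,
                           random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=5, stratify=y)

# Step 1: recursive feature elimination driven by a random forest.
selector = RFE(RandomForestClassifier(n_estimators=100, random_state=5),
               n_features_to_select=10).fit(X_tr, y_tr)
auc_rfe = roc_auc_score(y_te, selector.predict_proba(X_te)[:, 1])

# Step 2: linear (here equal-weight) combination of probability scores
# from two encoding-specific random forest models.
rf1 = RandomForestClassifier(n_estimators=100, random_state=5).fit(X_tr[:, :20], y_tr)
rf2 = RandomForestClassifier(n_estimators=100, random_state=5).fit(X_tr[:, 20:], y_tr)
p = (0.5 * rf1.predict_proba(X_te[:, :20])[:, 1]
     + 0.5 * rf2.predict_proba(X_te[:, 20:])[:, 1])
auc_combined = roc_auc_score(y_te, p)
print(f"RFE AUC: {auc_rfe:.3f}  combined AUC: {auc_combined:.3f}")
```

PredNTS itself evaluates the combined score with five-fold cross-validation rather than the single hold-out split used here for brevity.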
