Identification and Classification of Cyber Threats Through SSH Honeypot Systems

Author(s):  
José María Jorquera Valero ◽  
Manuel Gil Pérez ◽  
Alberto Huertas Celdrán ◽  
Gregorio Martínez Pérez

As the number and sophistication of cyber threats increases year after year, security systems such as antivirus, firewalls, or Intrusion Detection Systems based on misuse detection techniques are improved in detection capabilities. However, these traditional systems are usually limited to detect potential threats, since they are inadequate to spot zero-day attacks or mutations in behaviour. Authors propose using honeypot systems as a further security layer able to provide an intelligence holistic level in detecting unknown threats, or well-known attacks with new behaviour patterns. Since brute-force attacks are increasing in recent years, authors opted for an SSH medium-interaction honeypot to acquire a log set from attacker's interactions. The proposed system is able to acquire behaviour patterns of each attacker and link them with future sessions for early detection. Authors also generate a feature set to feed Machine Learning algorithms with the main goal of identifying and classifying attacker's sessions, and thus be able to learn malicious intentions in executing cyber threats.


Information ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 315
Author(s):  
Nathan Martindale ◽  
Muhammad Ismail ◽  
Douglas A. Talbert

As new cyberattacks are launched against systems and networks on a daily basis, the ability for network intrusion detection systems to operate efficiently in the big data era has become critically important, particularly as more low-power Internet-of-Things (IoT) devices enter the market. This has motivated research in applying machine learning algorithms that can operate on streams of data, trained online or “live” on only a small amount of data kept in memory at a time, as opposed to the more classical approaches that are trained solely offline on all of the data at once. In this context, one important concept from machine learning for improving detection performance is the idea of “ensembles”, where a collection of machine learning algorithms are combined to compensate for their individual limitations and produce an overall superior algorithm. Unfortunately, existing research lacks proper performance comparison between homogeneous and heterogeneous online ensembles. Hence, this paper investigates several homogeneous and heterogeneous ensembles, proposes three novel online heterogeneous ensembles for intrusion detection, and compares their performance accuracy, run-time complexity, and response to concept drifts. Out of the proposed novel online ensembles, the heterogeneous ensemble consisting of an adaptive random forest of Hoeffding Trees combined with a Hoeffding Adaptive Tree performed the best, by dealing with concept drift in the most effective way. While this scheme is less accurate than a larger size adaptive random forest, it offered a marginally better run-time, which is beneficial for online training.



Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2006
Author(s):  
Malek Al-Zewairi ◽  
Sufyan Almajali ◽  
Moussa Ayyash

Advancements in machine learning and artificial intelligence have been widely utilised in the security domain, including but not limited to intrusion detection techniques. With the large training datasets of modern traffic, intelligent algorithms and powerful machine learning tools, security researchers have been able to greatly improve on the intrusion detection models and enhance their ability to detect malicious traffic more accurately. Nonetheless, the problem of detecting completely unknown security attacks is still an open area of research. The enormous number of newly developed attacks constitutes an eccentric challenge for all types of intrusion detection systems. Additionally, the lack of a standard definition of what constitutes an unknown security attack in the literature and the industry alike adds to the problem. In this paper, the researchers reviewed the studies on detecting unknown attacks over the past 10 years and found that they tended to use inconsistent definitions. This formulates the need for a standard consistent definition to have comparable results. The researchers proposed a new categorisation of two types of unknown attacks, namely Type-A, which represents a completely new category of unknown attacks, and Type-B, which represents unknown attacks within already known categories of attacks. The researchers conducted several experiments and evaluated modern intrusion detection systems based on shallow and deep artificial neural network models and their ability to detect Type-A and Type-B attacks using two well-known benchmark datasets for network intrusion detection. The research problem was studied as both a binary and multi-class classification problem. The results showed that the evaluated models had poor overall generalisation error measures, where the classification error rate in detecting several types of unknown attacks from 92 experiments was 50.09%, which highlights the need for new approaches and techniques to address this problem.



The network attacks become the most important security problems in the today’s world. There is a high increase in use of computers, mobiles, sensors,IoTs in networks, Big Data, Web Application/Server,Clouds and other computing resources. With the high increase in network traffic, hackers and malicious users are planning new ways of network intrusions. Many techniques have been developed to detect these intrusions which are based on data mining and machine learning methods. Machine learning algorithms intend to detect anomalies using supervised and unsupervised approaches.Both the detection techniques have been implemented using IDS datasets like DARPA98, KDDCUP99, NSL-KDD, ISCX, ISOT.UNSW-NB15 is the latest dataset. This data set contains nine different modern attack types and wide varieties of real normal activities. In this paper, a detailed survey of various machine learning based techniques applied on UNSW-NB15 data set have been carried out and suggested thatUNSW-NB15 is more complex than other datasets and is assumed as a new benchmark data set for evaluating NIDSs.



2020 ◽  
Vol 3 (2) ◽  
pp. 196-206
Author(s):  
Mausumi Das Nath ◽  
◽  
Tapalina Bhattasali

Due to the enormous usage of the Internet, users share resources and exchange voluminous amounts of data. This increases the high risk of data theft and other types of attacks. Network security plays a vital role in protecting the electronic exchange of data and attempts to avoid disruption concerning finances or disrupted services due to the unknown proliferations in the network. Many Intrusion Detection Systems (IDS) are commonly used to detect such unknown attacks and unauthorized access in a network. Many approaches have been put forward by the researchers which showed satisfactory results in intrusion detection systems significantly which ranged from various traditional approaches to Artificial Intelligence (AI) based approaches.AI based techniques have gained an edge over other statistical techniques in the research community due to its enormous benefits. Procedures can be designed to display behavior learned from previous experiences. Machine learning algorithms are used to analyze the abnormal instances in a particular network. Supervised learning is essential in terms of training and analyzing the abnormal behavior in a network. In this paper, we propose a model of Naïve Bayes and SVM (Support Vector Machine) to detect anomalies and an ensemble approach to solve the weaknesses and to remove the poor detection results



2021 ◽  
Vol 2021 ◽  
pp. 1-28
Author(s):  
Khalid M. Al-Gethami ◽  
Mousa T. Al-Akhras ◽  
Mohammed Alawairdhi

Optimizing the detection of intrusions is becoming more crucial due to the continuously rising rates and ferocity of cyber threats and attacks. One of the popular methods to optimize the accuracy of intrusion detection systems (IDSs) is by employing machine learning (ML) techniques. However, there are many factors that affect the accuracy of the ML-based IDSs. One of these factors is noise, which can be in the form of mislabelled instances, outliers, or extreme values. Determining the extent effect of noise helps to design and build more robust ML-based IDSs. This paper empirically examines the extent effect of noise on the accuracy of the ML-based IDSs by conducting a wide set of different experiments. The used ML algorithms are decision tree (DT), random forest (RF), support vector machine (SVM), artificial neural networks (ANNs), and Naïve Bayes (NB). In addition, the experiments are conducted on two widely used intrusion datasets, which are NSL-KDD and UNSW-NB15. Moreover, the paper also investigates the use of these ML algorithms as base classifiers with two ensembles of classifiers learning methods, which are bagging and boosting. The detailed results and findings are illustrated and discussed in this paper.



Sign in / Sign up

Export Citation Format

Share Document