Memory-Efficient Deep Learning for Botnet Attack Detection in IoT Networks

Electronics ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 1104
Author(s):  
Segun I. Popoola ◽  
Bamidele Adebisi ◽  
Ruth Ande ◽  
Mohammad Hammoudeh ◽  
Aderemi A. Atayero

Cyber attackers exploit a network of compromised computing devices, known as a botnet, to attack Internet-of-Things (IoT) networks. Recent research has recommended the use of a Deep Recurrent Neural Network (DRNN) for botnet attack detection in IoT networks. However, when the training data have high feature dimensionality, high network bandwidth and a large memory space are needed to transmit and store the data, respectively, in the IoT back-end server or cloud platform for Deep Learning (DL). Furthermore, given highly imbalanced network traffic data, the DRNN model produces low classification performance in minority classes. In this paper, we exploit the joint advantages of Long Short-Term Memory Autoencoder (LAE), Synthetic Minority Oversampling Technique (SMOTE), and DRNN to develop a memory-efficient DL method, named LS-DRNN. The effectiveness of this method is evaluated with the Bot-IoT dataset. Results show that the LAE method reduced the dimensionality of network traffic features in the training set from 37 to 10, and this consequently reduced the memory space required for data storage by 86.49%. The SMOTE method helped the LS-DRNN model to achieve high classification performance in minority classes, and the overall detection rate increased by 10.94%. Furthermore, the LS-DRNN model outperformed state-of-the-art models.
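
As a quick arithmetic aid (not taken from the paper), the sketch below computes the relative drop in the number of features per sample when going from 37 to 10. The 86.49% figure in the abstract refers to storage space, and the abstract does not say how that was measured, so the sketch does not attempt to reproduce it. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(before: int, after: int) -> float:
    """Percentage drop when going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Feature counts quoted in the abstract: 37 before the LAE, 10 after.
feature_cut = relative_reduction(37, 10)
print(f"feature-count reduction: {feature_cut:.2f}%")
```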

Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2985
Author(s):  
Segun I. Popoola ◽  
Bamidele Adebisi ◽  
Ruth Ande ◽  
Mohammad Hammoudeh ◽  
Kelvin Anoh ◽  
...  

Nowadays, hackers take illegal advantage of distributed resources in a network of computing devices (i.e., a botnet) to launch cyberattacks against the Internet of Things (IoT). Recently, diverse Machine Learning (ML) and Deep Learning (DL) methods have been proposed to detect botnet attacks in IoT networks. However, highly imbalanced network traffic data in the training set often degrade the classification performance of state-of-the-art ML and DL models, especially in classes with relatively few samples. In this paper, we propose an efficient DL-based botnet attack detection algorithm that can handle highly imbalanced network traffic data. Specifically, Synthetic Minority Oversampling Technique (SMOTE) generates additional minority samples to achieve class balance, while Deep Recurrent Neural Network (DRNN) learns hierarchical feature representations from the balanced network traffic data to perform discriminative classification. We develop DRNN and SMOTE-DRNN models with the Bot-IoT dataset, and the simulation results show that high class imbalance in the training data adversely affects the precision, recall, F1 score, area under the receiver operating characteristic curve (AUC), geometric mean (GM) and Matthews correlation coefficient (MCC) of the DRNN model. On the other hand, the SMOTE-DRNN model achieved better classification performance with 99.50% precision, 99.75% recall, 99.62% F1 score, 99.87% AUC, 99.74% GM and 99.62% MCC. Additionally, the SMOTE-DRNN model outperformed state-of-the-art ML and DL models.
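
The oversampling step can be illustrated with a minimal pure-Python sketch of the general SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function name `smote`, the toy data, and the parameter values (`k`, `seed`) are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature minority class (hypothetical values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point is an interpolation, it always stays inside the bounding box of the original minority samples; in practice the number of new samples would be chosen so that the minority class matches the majority class in size.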


2020 ◽  
Vol 39 (3) ◽  
pp. 4785-4801
Author(s):  
Cho Do Xuan ◽  
Mai Hoang Dao ◽  
Hoa Dinh Nguyen

Advanced Persistent Threat (APT) attacks are a form of malicious, intentional and clearly targeted attack. This attack technique is growing in both the number of recorded attacks and the extent of the danger it poses to organizations, businesses and governments. Therefore, detecting and warning of APT attacks in real systems is essential today. One of the most effective approaches to APT attack detection is to apply machine learning or deep learning to analyze network traffic. A number of studies and recommendations break network traffic down into network flows and then combine them with classification or clustering methods to look for signs of APT attacks. In particular, recent studies often apply machine learning algorithms to spot the presence of APT attacks based on network flow. In this paper, a new method based on deep learning to detect APT attacks using network flow is proposed. Accordingly, in our research, network traffic is analyzed into IP-based network flows, then the IP information is reconstructed from the flows, and finally deep learning models are used to extract features for distinguishing APT attack IPs from other IPs. Additionally, a combined deep learning model using Bidirectional Long Short-Term Memory (BiLSTM) and Graph Convolutional Networks (GCN) is introduced. The new detection model is evaluated and compared with some traditional machine learning models, i.e., Multi-layer perceptron (MLP) and single GCN models, in the experiments. Experimental results show that the BiLSTM-GCN model has the best performance in all evaluation scores. This not only shows that applying deep learning to network flow analysis to detect APT attacks is a sound choice but also suggests a new direction for network intrusion detection techniques based on deep learning.


Healthcare ◽  
2020 ◽  
Vol 8 (3) ◽  
pp. 291
Author(s):  
Chunwu Yin ◽  
Zhanbo Chen

Disease classification based on machine learning has become a crucial research topic in the fields of genetics and molecular biology. Generally, disease classification involves a supervised learning style; i.e., it requires a large number of labelled samples to achieve good classification performance. However, in the majority of cases, labelled samples are hard to obtain, so the amount of training data is limited. At the same time, many unclassified (unlabelled) sequences have been deposited in public databases, which may help the training procedure. Learning from both labelled and unlabelled samples is called semi-supervised learning and is very useful in many applications. Self-training can be implemented using high- to low-confidence samples to prevent noisy samples from affecting the robustness of semi-supervised learning in the training process. The deep forest method with the hyperparameter settings used in this paper can achieve excellent performance. Therefore, in this work, we propose a novel approach that combines a deep learning model with semi-supervised learning and self-training to improve the performance in disease classification, which utilizes unlabelled samples to update a mechanism designed to increase the number of high-confidence pseudo-labelled samples. The experimental results show that our proposed model can achieve good performance in disease classification and disease-causing gene identification.
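
The high- to low-confidence selection in self-training can be sketched as a simple threshold filter over predicted class probabilities. This is an illustrative sketch only, not the paper's mechanism: the function name `select_pseudo_labels`, the probability values, and the thresholds are all hypothetical.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (index, predicted_class) for unlabelled samples whose top
    class probability reaches the confidence threshold."""
    selected = []
    for index, probs in enumerate(probabilities):
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            selected.append((index, best_class))
    return selected


# Hypothetical class probabilities predicted for four unlabelled samples.
unlabelled_probs = [
    [0.97, 0.03],
    [0.55, 0.45],
    [0.08, 0.92],
    [0.70, 0.30],
]
high_conf = select_pseudo_labels(unlabelled_probs, threshold=0.9)
relaxed = select_pseudo_labels(unlabelled_probs, threshold=0.7)
```

In a self-training loop, the samples in `high_conf` would be added to the labelled set first and the model retrained; later rounds could relax the threshold, as in `relaxed`, to admit lower-confidence samples once the model is more reliable.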


2021 ◽  
pp. 84-95
Author(s):  
Leiqi Wang ◽  
Weiqing Huang ◽  
Qiujian Lv ◽  
Yan Wang ◽  
HaiYan Chen

Author(s):  
H. Yassine ◽  
K. Tout ◽  
M. Jaber

Abstract. Machine learning (ML) has proven useful for a very large number of applications in several domains. It has seen remarkable growth in remote-sensing image analysis over the past few years. Deep Learning (DL), a subset of machine learning, was applied in this work to achieve a better classification of Land Use Land Cover (LULC) in satellite imagery using Convolutional Neural Networks (CNNs). The EuroSAT benchmark data set, which uses Sentinel-2 satellite images, is used as the training data set. Sentinel-2 provides images with 13 spectral feature bands, but surprisingly little attention has been paid to these features in deep learning models. The majority of applications have focused only on RGB because of the high availability of RGB models in computer vision. While RGB gives an accuracy of 96.83% using a CNN, we present two approaches to improve the classification performance of Sentinel-2 images. In the first approach, features are extracted from the 13 spectral feature bands of Sentinel-2 instead of RGB, which leads to an accuracy of 98.78%. In the second approach, features are extracted from the 13 spectral bands of Sentinel-2 in addition to calculated indices used in LULC, such as Blue Ratio (BR), Vegetation index based on Red Edge (VIRE) and Normalized Near Infrared (NNIR), which gives a better accuracy of 99.58%.
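
The second approach, appending calculated indices to the spectral bands, can be sketched for a single pixel as follows. The abstract does not give the index formulas, so the definition used here, NNIR = NIR / (NIR + Red + Green), is an assumption based on one commonly cited form; the function names, band ordering, and reflectance values are hypothetical.

```python
def nnir(nir, red, green):
    """Normalized NIR for one pixel, assumed as NIR / (NIR + Red + Green)."""
    total = nir + red + green
    return nir / total if total else 0.0


def add_index_channel(pixel_bands, nir_idx, red_idx, green_idx):
    """Return the pixel's band values with the NNIR index appended."""
    value = nnir(pixel_bands[nir_idx], pixel_bands[red_idx], pixel_bands[green_idx])
    return pixel_bands + [value]


# Toy pixel with four bands ordered [blue, green, red, nir] (hypothetical).
pixel = [0.1, 0.2, 0.3, 0.5]
augmented = add_index_channel(pixel, nir_idx=3, red_idx=2, green_idx=1)
```

Applied to every pixel, this turns an image with N bands into one with N + 1 input channels, which is the kind of extended input a CNN would then be trained on.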


2019 ◽  
Vol 9 (12) ◽  
pp. 2550
Author(s):  
Lim ◽  
Kim ◽  
Kim ◽  
Hong ◽  
Han

Recently, with the advent of various Internet of Things (IoT) applications, a massive amount of network traffic is being generated. A network operator must provide different quality of service, according to the service provided by each application. Toward this end, many studies have investigated how to classify various types of application network traffic accurately. In particular, since many applications use temporary or dynamic IP addresses or port numbers in the IoT environment, classification based on the payload alone is more suitable than classification that also uses packet header information. Furthermore, to automatically respond to various applications, it is necessary to classify traffic using deep learning without intervention by the network operator. In this study, we propose a traffic classification scheme using a deep learning model in software-defined networks. We generate flow-based payload datasets through our own network traffic pre-processing, and train two deep learning models: 1) the multi-layer long short-term memory (LSTM) model and 2) the combination of convolutional neural network and single-layer LSTM models, to perform network traffic classification. We also execute a model tuning procedure to find the optimal hyper-parameters of the two deep learning models. Lastly, we analyze the network traffic classification performance on the basis of the F1-score for the two deep learning models, and show the superiority of the multi-layer LSTM model for network packet classification.
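
For reference, the F1-score used to compare the two models can be computed from true and predicted labels as sketched below. The abstract does not say how per-class scores were averaged, so the macro average shown here is an assumption; the function names and the toy application labels are hypothetical.

```python
def f1_per_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true))
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)


# Toy traffic labels for six flows (hypothetical).
y_true = ["web", "web", "video", "video", "voip", "voip"]
y_pred = ["web", "video", "video", "video", "voip", "web"]
score = macro_f1(y_true, y_pred)
```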


2020 ◽  
pp. 377-394
Author(s):  
Zuleyha Yiner ◽  
Nurefsan Sertbas ◽  
Safak Durukan-Odabasi ◽  
Derya Yiltas-Kaplan

Cloud computing, which aims to provide convenient, on-demand network access to shared software and hardware resources, has security as its greatest challenge. Data security is the main security concern, followed by intrusion detection and prevention in cloud infrastructure. In this chapter, general information about cloud computing and its security issues is discussed. In order to prevent or avoid many attacks, a number of machine learning approaches have been proposed. However, these approaches do not provide efficient results for identifying unknown types of attacks. Deep learning makes it possible to learn more complex features, and thanks to the collection of big data as training data, deep learning achieves more successful results. Many deep learning algorithms have been proposed for attack detection. Deep network architectures are divided into two categories, and descriptions of each architecture and its related attack detection studies are discussed in the following section of the chapter.


2020 ◽  
Vol 12 (12) ◽  
pp. 2000
Author(s):  
Chiman Kwan ◽  
Bulent Ayhan ◽  
Bence Budavari ◽  
Yan Lu ◽  
Daniel Perez ◽  
...  

There is an emerging interest in using hyperspectral data for land cover classification. The motivation behind using hyperspectral data is the notion that increasing the number of narrowband spectral channels would provide richer spectral information and thus help improve the land cover classification performance. Although hyperspectral data with hundreds of channels provide detailed spectral signatures, the curse of dimensionality might lead to degradation in the land cover classification performance. Moreover, in some practical applications, hyperspectral data may not be available due to cost, data storage, or bandwidth issues, and RGB and near infrared (NIR) could be the only image bands available for land cover classification. Light detection and ranging (LiDAR) data are another type of data that can assist land cover classification, especially if the land covers of interest have different heights. In this paper, we examined the performance of two Convolutional Neural Network (CNN)-based deep learning algorithms for land cover classification using only four bands (RGB+NIR) and five bands (RGB+NIR+LiDAR), where this limited number of image bands was augmented using Extended Multi-attribute Profiles (EMAP). The deep learning algorithms were applied to a well-known dataset used in the 2013 IEEE Geoscience and Remote Sensing Society (GRSS) Data Fusion Contest. With EMAP augmentation, the two deep learning algorithms were observed to achieve better land cover classification performance using only four bands than when using all 144 hyperspectral bands.


2021 ◽  
Vol 2 (2) ◽  
pp. 1-25
Author(s):  
Stein Kristiansen ◽  
Konstantinos Nikolaidis ◽  
Thomas Plagemann ◽  
Vera Goebel ◽  
Gunn Marit Traaen ◽  
...  

Sleep apnea is a common and strongly under-diagnosed severe sleep-related respiratory disorder with periods of disrupted or reduced breathing during sleep. To diagnose sleep apnea, sleep data are collected with either polysomnography or polygraphy and scored by a sleep expert. In this work, we investigate the use of supervised machine learning to automate the analysis of polygraphy data from the A3 study, which contains more than 7,400 hours of sleep monitoring data from 579 patients. We conduct a systematic comparative study of classification performance and resource use with different combinations of 27 classifiers and four sleep signals. The classifiers achieve up to 0.8941 accuracy (kappa: 0.7877) when using all four signal types simultaneously and up to 0.8543 accuracy (kappa: 0.7080) with only one signal, i.e., oxygen saturation. Methods based on deep learning outperform other methods by a large margin. All deep learning methods achieve nearly the same maximum classification performance even when they have very different architectures and sizes. When jointly accounting for classification performance, resource consumption and the ability to achieve high classification performance with less training data, we find that convolutional neural networks substantially outperform the other classifiers.
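
The kappa values reported alongside accuracy can be read with the help of the sketch below, which assumes that "kappa" means Cohen's kappa: agreement between predicted and true labels, corrected for the agreement expected by chance. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the A3 study data.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy labels: 1 = apneic segment, 0 = normal breathing (hypothetical).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
kappa = cohen_kappa(y_true, y_pred)
```

With imbalanced classes, accuracy alone can look high simply because the majority class dominates; kappa is lower in such cases, which is one reason to report both.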

