A Look Behind the Curtain: Traffic Classification in an Increasingly Encrypted Web

Author(s):  
Iman Akbari ◽  
Mohammad A. Salahuddin ◽  
Leni Ven ◽  
Noura Limam ◽  
Raouf Boutaba ◽  
...  

Traffic classification is essential in network management for operations ranging from capacity planning, performance monitoring, volumetry, and resource provisioning to anomaly detection and security. Recently, it has become increasingly challenging with the widespread adoption of encryption on the Internet, e.g., as the de facto standard in the HTTP/2 and QUIC protocols. In the current state of encrypted traffic classification using Deep Learning (DL), we identify fundamental issues in the way it is typically approached. For instance, although complex DL models with millions of parameters are being used, these models implement a relatively simple logic based on certain header fields of the TLS handshake, limiting model robustness to future versions of encrypted protocols. Furthermore, encrypted traffic is often treated as any other raw input for DL, while crucial domain-specific considerations are commonly ignored. In this paper, we design a novel feature engineering approach that generalizes well for encrypted web protocols, and develop a neural network architecture based on Stacked Long Short-Term Memory (LSTM) layers and Convolutional Neural Networks (CNN) that works very well with our feature design. We evaluate our approach on a real-world traffic dataset from a major ISP and Mobile Network Operator. We achieve an accuracy of 95% in service classification with less raw traffic and a smaller number of parameters, outperforming a state-of-the-art method with nearly 50% fewer false classifications. We show that our DL model generalizes for different classification objectives and encrypted web protocols. We also evaluate our approach on a public QUIC dataset with finer, application-level granularity in labeling, achieving an overall accuracy of 99%.

2021 ◽  
Vol 15 ◽  
Author(s):  
Mengmeng Ge ◽  
Xiangzhan Yu ◽  
Likun Liu

With the rapid popularization of robots, the risks brought by robot communication have also attracted the attention of researchers. Current traffic classification methods based on plaintext cannot classify encrypted traffic, while other methods based on statistical analysis require manual extraction of features. This paper proposes (i) a traffic classification framework based on a capsule neural network. The method uses a multilayer neural network that automatically learns the characteristics of the data stream, and it uses capsule vectors instead of single scalar inputs to classify encrypted network traffic effectively. (ii) For different network structures, a classification network combining a convolutional neural network and a long short-term memory network is proposed. This structure learns both the temporal and spatial characteristics of network traffic. Experimental results show that the network model can classify encrypted traffic without manual feature extraction, and that recognition accuracy increases by 8% over the previous tool.


2021 ◽  
Vol 10 (4) ◽  
pp. 2181-2191
Author(s):  
Devi Munandar ◽  
Andri Fachrur Rozie ◽  
Andria Arisal

Sentiment analysis of short texts is challenging because of their limited context. It becomes more challenging for a low-resource language like Bahasa Indonesia. However, various deep learning techniques can achieve fairly good accuracy. This paper explores several deep learning methods, such as multilayer perceptron (MLP), convolutional neural network (CNN), and long short-term memory (LSTM), and builds combinations of those three architectures. The combinations are intended to get the best of each architecture. The MLP uses the output of the previous model to obtain the classification output. The CNN layer extracts the word feature vector from text sequences. Subsequently, the LSTM repeatedly selects or discards feature sequences based on their context. Those advantages are useful for datasets from different domains. The experiments on sentiment analysis of short text in Bahasa Indonesia show that hybrid models can obtain better performance, and the same architecture can be directly used on another domain-specific dataset.
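The "select or discard" behaviour attributed to the LSTM above comes from its gates. The following is a minimal sketch of one step of a scalar LSTM cell in plain Python; it is not the authors' model, and the weight layout (`w` mapping a gate name to a hypothetical `(w_x, w_h, bias)` triple) and all numeric values are assumptions chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; `w` maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))  # fraction of the old cell state to keep
    i = sigmoid(pre("input"))   # fraction of the candidate to write
    o = sigmoid(pre("output"))  # fraction of the cell state to expose
    g = math.tanh(pre("cand"))  # candidate content from the current input
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Illustrative weights (assumed): saturated gates make the behaviour obvious.
keep = {"forget": (0, 0, 20), "input": (0, 0, -20), "output": (0, 0, 20), "cand": (1, 0, 0)}
drop = {"forget": (0, 0, -20), "input": (0, 0, 20), "output": (0, 0, 20), "cand": (1, 0, 0)}

_, c_keep = lstm_step(0.5, 0.0, 0.8, keep)  # old state retained, input ignored
_, c_drop = lstm_step(0.5, 0.0, 0.8, drop)  # old state discarded, input written
```

With the forget gate open and the input gate closed, the cell state stays close to its previous value; with the gates reversed, it is overwritten by the candidate derived from the current input. In a trained network the gate values are learned from context rather than fixed as here.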


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Taimur Bakhshi ◽  
Bogdan Ghita

An increasing number of Internet application services are relying on encrypted traffic to offer adequate consumer privacy. Anomaly detection in encrypted traffic to circumvent and mitigate cyber security threats is, however, an open and ongoing research challenge due to the limitation of existing traffic classification techniques. Deep learning is emerging as a promising paradigm, allowing reduction in manual determination of feature set to increase classification accuracy. The present work develops a deep learning-based model for detection of anomalies in encrypted network traffic. Three different publicly available datasets including the NSL-KDD, UNSW-NB15, and CIC-IDS-2017 are used to comprehensively analyze encrypted attacks targeting popular protocols. Instead of relying on a single deep learning model, multiple schemes using convolutional (CNN), long short-term memory (LSTM), and recurrent neural networks (RNNs) are investigated. Our results report a hybrid combination of convolutional (CNN) and gated recurrent unit (GRU) models as outperforming others. The hybrid approach benefits from the low-latency feature derivation of the CNN, and an overall improved training dataset fitting. Additionally, the highly effective generalization offered by GRU results in optimal time-domain-related feature extraction, resulting in the CNN and GRU hybrid scheme presenting the best model.


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Xinyi Hu ◽  
Chunxiang Gu ◽  
Fushan Wei

The development of the Internet has made encrypted network traffic increasingly complex. Identifying the specific classes of encrypted network traffic is an important part of maintaining information security. Traditional traffic classification based on machine learning largely requires expert experience. As an end-to-end model, a deep neural network can minimize human intervention. This paper proposes the CLD-Net model, which can effectively distinguish encrypted network traffic. By segmenting and recombining the packet payload of the raw flow, it can automatically extract the features related to the packet payload, and by changing the expression of the packet interval, it integrates the packet interval information into the model. We use the ability of a Convolutional Neural Network (CNN) to distinguish image classes to learn and classify the grayscale images into which the raw flow has been preprocessed, and then use the effectiveness of a Long Short-Term Memory (LSTM) network on time series data to further enhance the model’s ability to classify. Finally, through feature reduction, the high-dimensional features learned by the neural network are converted into 8 dimensions to distinguish 8 different classes of encrypted network traffic. To verify the effectiveness of the CLD-Net model, we conduct experiments on the ISCX public dataset. The results show that our proposed model can distinguish whether unknown network traffic uses a Virtual Private Network (VPN) with an accuracy of 98% and can identify the specific traffic (chats, audio, or file) of the Facebook and Skype applications with an accuracy of 92.89%.
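Turning raw flow bytes into a grayscale image, as described above, can be sketched as follows. This is a simplified illustration and not CLD-Net's actual preprocessing: it only truncates or zero-pads the payload to a fixed length and reshapes it, and the function name `flow_to_grayscale` and the default image side of 28 are assumptions.

```python
from typing import List


def flow_to_grayscale(payload: bytes, side: int = 28) -> List[List[int]]:
    """Map raw bytes to a side x side grid of pixel intensities (0-255)."""
    n = side * side
    # Truncate long payloads and zero-pad short ones so every image has the same shape.
    fixed = payload[:n].ljust(n, b"\x00")
    # Each byte becomes one pixel; each row is a consecutive chunk of `side` bytes.
    return [list(fixed[r * side:(r + 1) * side]) for r in range(side)]


# Tiny example: five payload bytes laid out on a 4 x 4 grid.
image = flow_to_grayscale(b"\x16\x03\x01\x02\x00", side=4)
```

Because every byte already lies in the range 0 to 255, no rescaling is needed before the grid is fed to an image classifier such as a CNN.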


Author(s):  
Salman Hussain Raza ◽  
Bibhya Nand Sharma ◽  
Kaylash Chaudhary

While recent technological advancements have enabled instructors to deliver mathematical concepts and theories beyond physical boundaries innovatively and interactively, poor performance and low success rates in mathematics courses have always been a major concern of educators. More specifically, in an online learning environment, where students are not physically present in the classroom and access course materials over the network, it is difficult for course coordinators to track and monitor every student’s academic learning and experiences. Thus, automated student performance monitoring is indispensable, since it is easy for online students, especially those underperforming, to be “out of sight,” and hence to get derailed and off-track. Since student learning and performance evolve over time, it is reasonable to consider student performance monitoring as a time-series problem and implement a time-series predictive model to forecast students’ educational progress and achievement. This research paper presents a case study from a higher education institute where interaction data and course achievement from a previously offered online course are used to develop a time-series predictive model using a Long Short-Term Memory network, a special kind of Recurrent Neural Network architecture. The proposed model predicts student status at any given time of the semester by examining the trend or pattern learned from previous events. The model reported an average classification accuracy of 86% and 84% on the training and testing datasets, respectively. The proposed model is trialed on selected online math courses, with exciting yet dissimilar trends recorded.
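Predicting status "at any given time of the semester" implies that the model sees only the events up to that point. One way to prepare such data, sketched below under assumptions, is to emit one fixed-length sample per elapsed week. The function name `prefix_samples`, the weekly interaction counts, and the zero left-padding are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def prefix_samples(weekly: List[float], label: int, max_len: int) -> List[Tuple[List[float], int]]:
    """Build one sample per elapsed week, left-padded with zeros to max_len."""
    samples = []
    for week in range(1, len(weekly) + 1):
        prefix = weekly[:week][-max_len:]  # only the most recent max_len weeks
        padded = [0.0] * (max_len - len(prefix)) + prefix
        samples.append((padded, label))
    return samples


# Hypothetical weekly interaction counts for one student with final label 1.
samples = prefix_samples([3.0, 5.0, 0.0, 2.0], label=1, max_len=3)
```

Each sample pairs the history available at a given week with the student's eventual outcome, so a sequence model trained on them can be queried at any point in the term.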


2018 ◽  
Author(s):  
Phanidra Palagummi ◽  
Vedant Somani ◽  
Krishna M. Sivalingam ◽  
Balaji Venkat

Network connectivity is increasingly based on wireless network technologies, especially in developing nations where the wired network infrastructure is not accessible to a large segment of the population. Wireless data network technologies based on 2G and 3G are quite common globally; 4G-based deployments have been on the rise during the past few years. At the same time, the increasing high-bandwidth and low-latency requirements of mobile applications have propelled the Third Generation Partnership Project (3GPP) standards organization to develop standards for the next generation of mobile networks, based on recent advances in wireless communication technologies. This standard is called the Fifth Generation (5G) wireless network standard. This paper presents a high-level overview of the important architectural components, the advanced communication technologies, the advanced networking technologies such as Network Function Virtualization, and other important aspects that are part of the 5G network standards. The paper also describes some of the common future-generation applications that require low-latency and high-bandwidth communications.


2021 ◽  
Vol 11 (4) ◽  
pp. 1829
Author(s):  
Davide Grande ◽  
Catherine A. Harris ◽  
Giles Thomas ◽  
Enrico Anderlini

Recurrent Neural Networks (RNNs) are increasingly being used for model identification, forecasting and control. When identifying physical models for which the mathematical description of the system is unknown, Nonlinear AutoRegressive models with eXogenous inputs (NARX) or Nonlinear AutoRegressive Moving-Average models with eXogenous inputs (NARMAX) are typically used. In the context of data-driven control, machine learning algorithms have been shown to perform comparably to advanced control techniques, but lack the properties of traditional stability theory. This paper illustrates a method to prove a posteriori the stability of a generic neural network, showing its application to the state-of-the-art RNN architecture. The presented method relies on identifying the poles associated with the network, starting from the input/output data. Providing a framework to guarantee the stability of any neural network architecture, combined with generalisability and applicability to different fields, can significantly broaden the use of neural networks in dynamic systems modelling and control.
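The idea of judging stability from poles identified on input/output data can be illustrated with a linear stand-in. The sketch below is not the paper's method for neural networks: it fits a second-order linear ARX model by least squares and checks whether its poles lie inside the unit circle. The synthetic system coefficients (1.2, -0.5, 1.0), the function names, and the small solver are assumptions made for this example.

```python
import cmath
import random


def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_arx2(u, y):
    """Least-squares fit of y[k] = a1*y[k-1] + a2*y[k-2] + b*u[k-1]."""
    rows = [[y[k - 1], y[k - 2], u[k - 1]] for k in range(2, len(y))]
    targets = [y[k] for k in range(2, len(y))]
    # Normal equations: (R^T R) theta = R^T t
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve(ata, atb)


def poles(a1, a2):
    """Roots of the characteristic polynomial z^2 - a1*z - a2 = 0."""
    disc = cmath.sqrt(a1 * a1 + 4 * a2)
    return (a1 + disc) / 2, (a1 - disc) / 2


# Synthetic, noise-free data from a known stable system (illustrative values).
rng = random.Random(0)
u = [rng.uniform(-1.0, 1.0) for _ in range(200)]
y = [0.0, 0.0]
for k in range(2, 200):
    y.append(1.2 * y[k - 1] - 0.5 * y[k - 2] + 1.0 * u[k - 1])

a1, a2, b = fit_arx2(u, y)
p1, p2 = poles(a1, a2)
is_stable = max(abs(p1), abs(p2)) < 1.0  # discrete-time stability criterion
```

For a discrete-time linear model, all poles strictly inside the unit circle imply a bounded response to bounded input. Applying the same criterion to a nonlinear network requires additional steps that this sketch does not attempt.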


Author(s):  
Giuseppe Aceto ◽  
Domenico Ciuonzo ◽  
Antonio Montieri ◽  
Antonio Pescapé
