CBD: A Deep-Learning-Based Scheme for Encrypted Traffic Classification with a General Pre-Training Method

Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8231
Author(s):  
Xinyi Hu ◽  
Chunxiang Gu ◽  
Yihang Chen ◽  
Fushan Wei

As encrypted traffic accounts for a rapidly growing share of network traffic, encrypted traffic classification has become an increasingly important part of traffic analysis. In closed environments, the classification of encrypted traffic has been studied extensively, but the resulting models often work only with labeled data and are difficult to apply in real environments. To solve these problems, we propose a transferable model called CBD with generalization abilities for encrypted traffic classification in real environments. The overall structure of CBD can be generally described as a combination of a one-dimensional CNN and the Transformer encoder. The model can be pre-trained with unlabeled data to learn the basic characteristics of encrypted traffic data, and then be transferred to other datasets to classify encrypted traffic at the packet level and the flow level. The performance of the proposed model was evaluated on a public dataset. The results showed that the CBD model outperformed the baseline methods and that the pre-training method improves the classification ability of the model.

2016 ◽  
Vol 27 (1) ◽  
pp. 312-319 ◽  
Author(s):  
Guy Cafri ◽  
Juanjuan Fan

In many medical applications involving observational survival data there will be a cross-classification of doctors and hospitals, as well as an interest in controlling for potentially confounding doctor and hospital effects when evaluating the effectiveness of a medical intervention. In this paper, we propose the use of a between-within model with cross-classified random effects and show through simulation that it performs better than alternative models. A real data example illustrates the application of the proposed model in a study of the survival of hip implants. The proposed model has broad utility in determining the effectiveness of medical interventions.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Junkai Yi ◽  
Guanglin Gong ◽  
Zeyu Liu ◽  
Yacong Zhang

Traditional approaches to analyzing encrypted traffic in network applications consider traffic classification only over the complete communication process and ignore the simplified communication process, and application fingerprints are heavily duplicated during state transitions. To address these problems, a new classification approach for encrypted traffic is proposed. The article applies the Gaussian mixture model (GMM) to analyze message lengths, and the resulting model is used to solve the problem of application fingerprint duplication. Fingerprints of the same application with similar lengths are divided into as few clusters as possible by a constrained clustering approach, which speeds up convergence and improves the clustering effect. The experimental results show that, compared with other encrypted traffic classification approaches, the proposed approach improves TPR, FPR, Precision, and Recall by 11.7%, 19.8%, 6.86%, and 5.36%, respectively, and significantly improves the classification of encrypted traffic.
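As a rough illustration of fitting a Gaussian mixture to message lengths, the sketch below runs a plain EM loop for a one-dimensional mixture. It is not the authors' constrained-clustering method: the function name `fit_gmm_1d`, the initialisation, and the variance floor are assumptions made for illustration only.

```python
import math


def fit_gmm_1d(xs, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    xs = sorted(xs)
    n = len(xs)
    # Assumed initialisation: spread the starting means across the sorted data.
    means = [xs[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    grand_mean = sum(xs) / n
    overall_var = sum((x - grand_mean) ** 2 for x in xs) / n
    variances = [max(overall_var, 1e-6)] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each message length.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300  # guard against all densities underflowing
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # assumed floor to avoid collapse

    return weights, means, variances
```

Called on a list of packet or message lengths with two clear size groups, for example short control messages and near-MTU data messages, the returned means settle near the centre of each group. The constraint that keeps fingerprints of one application in as few clusters as possible would have to be added on top of this loop.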


Author(s):  
Ubaid Illahi ◽  
Mohammad Shafi Mir

Classification of vehicles in the traffic stream is a prerequisite for planning and designing facilities for road users. Considering the importance and growing popularity of automated systems in this field, the aim of this article is to compare two algorithms, one using the Background Subtraction (BS) technique and the other using a Convolutional Neural Network (CNN), with a primary focus on an increased number of vehicle classes. To check the reliability of these algorithms, their outputs were validated against data obtained from Kachkoot Toll Plaza, India. The results were analyzed using drop-line diagrams and confusion matrices. The overall efficiency of the CNN-based algorithm (0.98) was found to be better than that of the BS-based algorithm (0.95). The comparison presented in this paper will be useful for transportation professionals and agencies.


Author(s):  
Liwen Peng ◽  
Yongguo Liu

The past decade has witnessed the growing popularity of multi-label classification algorithms in fields such as text categorization, music information retrieval, and the classification of videos and medical proteins. In the meantime, methods based on the principle of universal gravitation have been used extensively for classification in machine learning owing to their simplicity and high performance. In light of the above, this paper proposes a novel multi-label classification algorithm called the interaction and data gravitation-based model for multi-label classification (ITDGM). The algorithm replaces the interaction between two objects with the attraction between two particles. The author carries out a series of experiments on five multi-label datasets. The experimental results show that the ITDGM performs better than some well-known multi-label classification algorithms. The effect of the proposed model is assessed by the example-based F1-measure and the label-based micro F1-measure.


Classification of network traffic is becoming ever more relevant in understanding and addressing security issues in Internet applications. Virtual Private Networks (VPNs) have become a popular form of communication on the Internet. In this study, a new model for classifying traffic as VPN or non-VPN is proposed. The XGBoost algorithm is used to rank features and to build the classification model. The proposed model outperformed other classification algorithms, achieving 91.6% accuracy, which is the highest registered accuracy for the selected dataset. To illustrate the merit of the proposed model, a comparison was made with sixteen different classification algorithms.


Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1080
Author(s):  
Bei Lu ◽  
Nurbol Luktarhan ◽  
Chao Ding ◽  
Wenhui Zhang

The wide application of encryption technology has made traffic classification a major challenge in the field of network security. Traditional methods such as machine learning, which rely heavily on feature engineering, can no longer fully meet the needs of encrypted traffic classification. Therefore, we propose an Inception-LSTM (ICLSTM) traffic classification method in this paper to achieve encrypted traffic service identification. This method converts traffic data into common gray images, and then uses the constructed ICLSTM neural network to extract key features and perform effective traffic classification. To alleviate the problem of category imbalance, different weight parameters are set for each category in the training phase, making the treatment of different categories of encrypted traffic more symmetrical and the identification effect more balanced and reasonable. The method is validated on the public ISCX 2016 dataset, and the results of five classification experiments show that the accuracy of the method exceeds 98% for both regular encrypted traffic service identification and VPN encrypted traffic service identification. At the same time, this deep learning-based classification method also greatly simplifies traffic feature extraction.
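The two preprocessing ideas in this abstract, turning raw traffic bytes into a grayscale image and weighting classes to counter imbalance, can be sketched as follows. The default 28x28 image size, zero padding, the inverse-frequency weighting formula, and both function names are assumptions for illustration; the abstract does not state the authors' exact choices.

```python
from collections import Counter


def bytes_to_gray_image(payload: bytes, side: int = 28):
    """Truncate or zero-pad raw bytes to side*side values and reshape into rows.

    Each byte (0-255) is treated as one grayscale pixel.
    """
    size = side * side
    padded = payload[:size].ljust(size, b"\x00")
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so errors on them cost more in training.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}
```

The image rows could then be fed to any image classifier, and the weight dictionary passed to a weighted loss function; both steps depend on the framework in use and are left out here.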


Author(s):  
Virender Ranga ◽  
Shivam Gupta ◽  
Priyansh Agrawal ◽  
Jyoti Meena

Introduction: The work of pathologists is chiefly concerned with detecting diseases and supporting patients in their healthcare and well-being. At present, pathologists do this by manually viewing slides using a microscope and other instruments. This method suffers from several problems: there is no standard way of diagnosing, it is prone to human error, and it places a heavy load on laboratory staff, who must examine a large number of slides daily. Method: The slide-viewing method is widely used, and slides are converted into digital form to produce high-resolution images. This enables deep learning and machine learning to be applied to this field of medical science. In the present study, a neural network has been proposed for the classification of blood cell images into various categories. When an input image is passed through the proposed architecture and all the hyperparameters and dropout ratio values are set in accordance with the proposed algorithm, the model classifies the blood images with an accuracy of 95.24%. Result: After training the models for 20 epochs, the training accuracy, testing accuracy, and corresponding training and testing losses of the proposed model are plotted using matplotlib to show their trends. Discussion: The performance of the proposed model is better than that of existing standard architectures and other work done by various researchers. Thus, the proposed model enables the development of a pathological system that will reduce human error and the daily load on laboratory staff. This can in turn help pathologists carry out their work more efficiently and effectively. Conclusion: In the present study, a neural network has been proposed for the classification of blood cell images into various categories. These categories have significance in the medical sciences. When an input image is passed through the proposed architecture and all the hyperparameters and dropout ratio values are set in accordance with the proposed algorithm, the model classifies the images with an accuracy of 95.24%. This accuracy is better than that of standard architectures. Further, the proposed neural network performs better than related work carried out by various researchers.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yan Li ◽  
Yifei Lu

Due to the increasing variety of encryption protocols and services in the network, the characteristics of the application are very different under different protocols. However, there are very few existing studies on encrypted application classification considering the type of encryption protocols. In order to achieve the refined classification of encrypted applications, this paper proposes an Encrypted Two-Label Classification using CNN (ETCC) method, which can identify both the protocols and the applications. ETCC is a two-stage two-label classification method. The first stage classifies the protocol used for encrypted traffic. The second stage uses the corresponding classifier to classify applications according to the protocol used by the traffic. Experimental results show that the ETCC achieves 97.65% accuracy on a public dataset (CICDarknet2020).
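The two-stage, two-label flow described above can be sketched as a dispatch from a protocol classifier to a per-protocol application classifier. The classifiers here are stand-in callables rather than the CNNs the method's name refers to, and the names `two_stage_classify`, `protocol_clf`, and `app_clfs`, as well as the labels in the usage note, are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
Classifier = Callable[[Features], str]


def two_stage_classify(
    features: Features,
    protocol_clf: Classifier,
    app_clfs: Dict[str, Classifier],
) -> Tuple[str, str]:
    """Return (protocol_label, application_label) for one traffic sample.

    Stage 1 predicts the encryption protocol; stage 2 picks the application
    classifier registered for that protocol and applies it to the same features.
    """
    protocol = protocol_clf(features)
    if protocol not in app_clfs:
        raise KeyError(f"no application classifier registered for {protocol!r}")
    application = app_clfs[protocol](features)
    return protocol, application
```

With toy rules such as `protocol_clf = lambda f: "tls" if f[0] > 0.5 else "tor"` and one application rule per protocol, `two_stage_classify([0.9, 0.2], protocol_clf, app_clfs)` returns a pair of labels. Keeping one application classifier per protocol lets each second-stage model specialise in traffic from a single protocol.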


Sensors ◽  
2019 ◽  
Vol 19 (3) ◽  
pp. 480 ◽  
Author(s):  
César Gil ◽  
Javier Parra-Arnau

The Internet, with the rise of the IoT, is one of the most powerful means of propagating a terrorist threat, and at the same time the perfect environment for deploying ubiquitous online surveillance systems. This paper tackles the problem of online surveillance, which we define as the monitoring by a security agency of a set of websites through tracking and classification of profiles that are potentially suspected of carrying out terrorist attacks. We conduct a theoretical analysis in this scenario that investigates the introduction of automatic classification technology compared to the status quo involving manual investigation of the collected profiles. Our analysis starts by examining the suitability of game-theoretic models for decision-making in the introduction of this technology. We propose an adversarial-risk-analysis (ARA) model as a novel way of approaching the online surveillance problem that has the advantage of discarding the hypothesis of common knowledge. The proposed model allows us to study the rationality conditions of the automatic suspect detection technology, determining under which circumstances it is better than the traditional human-based approach. Our experimental results show the benefits of the proposed model. Compared to standard game theory, our ARA-based model indicates in general greater prudence in the deployment of the automatic technology and exhibits satisfactory performance without having to relax crucial hypotheses such as common knowledge, which would subtract realism from the problem, although at the expense of higher computational complexity.

