An analysis of clustering objectives for feature selection applied to encrypted traffic identification

With the rapid growth of encrypted network traffic, identifying it has become a hot topic in information security. Because existing methods have difficulty identifying the application that encrypted traffic belongs to, this paper proposes a new encrypted traffic identification scheme with two levels. In the first level, the entropy and the Monte Carlo estimate of π are used as features to identify encrypted traffic with a C4.5 decision tree. In the second level, the application types are distinguished within the encrypted traffic selected in the first level. First, a variational autoencoder is used to extract layer features, which are combined with commonly used stream features. Mutual information is then used to reduce the dimensionality of the combined features. Finally, a random forest classifier produces the final classification. Experimental results show that, compared with existing methods, the proposed scheme converges faster and achieves better recognition accuracy, recall, and F1-measure, with reported values higher than 97%.
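The two first-level features named in the abstract can be illustrated with a short sketch. The abstract does not say how the paper computes them, so everything below is an assumption: the function names are hypothetical, entropy is taken over the byte distribution of a payload, and the π estimate treats each group of six bytes as one (x, y) point with 24-bit coordinates, a convention borrowed from common randomness tests. Encrypted payloads should look nearly random, so their entropy should be close to 8 bits per byte and their π estimate close to the true value.

```python
import math
from collections import Counter


def shannon_entropy(payload: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    n = len(payload)
    counts = Counter(payload)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def monte_carlo_pi(payload: bytes, coord_bytes: int = 3) -> float:
    """Estimate pi from the payload bytes.

    Each group of 2 * coord_bytes bytes is read as one (x, y) point in a square.
    The fraction of points inside the inscribed quarter circle approximates pi / 4.
    Trailing bytes that do not fill a whole point are ignored.
    """
    step = 2 * coord_bytes
    radius = 256 ** coord_bytes - 1  # largest value a coordinate can take
    inside = 0
    total = 0
    for i in range(0, len(payload) - step + 1, step):
        x = int.from_bytes(payload[i:i + coord_bytes], "big")
        y = int.from_bytes(payload[i + coord_bytes:i + step], "big")
        total += 1
        if x * x + y * y <= radius * radius:
            inside += 1
    return 4.0 * inside / total if total else 0.0


def randomness_features(payload: bytes) -> tuple:
    """Return (entropy, absolute error of the pi estimate) for one payload."""
    entropy = shannon_entropy(payload)
    pi_error = abs(monte_carlo_pi(payload) - math.pi)
    return entropy, pi_error
```

A feature pair such as the one returned by `randomness_features` could then be fed to a decision tree; the choice of payload length and of any decision thresholds is left open here because the abstract does not specify them.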

Generalization of signatures for SSH encrypted traffic identification

2009 IEEE Symposium on Computational Intelligence in Cyber Security ◽

10.1109/cicybs.2009.4925105 ◽

2009 ◽

Cited By ~ 18

Author(s):

Riyad Alshammari ◽

Nur Zincir-Heywood

Keyword(s):

Traffic Identification ◽

Encrypted Traffic

Multi-stage Feature Selection for On-Line Flow Peer-to-Peer Traffic Identification

Communications in Computer and Information Science - Modeling, Design and Simulation of Systems ◽

10.1007/978-981-10-6502-6_44 ◽

2017 ◽

pp. 509-523

Author(s):

Bushra Mohammed Ali Abdalla ◽

Haitham A. Jamil ◽

Mosab Hamdan ◽

Joseph Stephen Bassi ◽

Ismahani Ismail ◽

...

Keyword(s):

Feature Selection ◽

Peer To Peer ◽

Traffic Identification ◽

Multi Stage ◽

Line Flow ◽

On Line ◽

Selection For

A Novel Peer to Peer Traffic Identification Approach based on Hybrid Feature Selection Algorithm

International Journal of Digital Content Technology and its Applications ◽

10.4156/jdcta.vol6.issue8.24 ◽

2012 ◽

Vol 6 (8) ◽

pp. 204-212 ◽

Cited By ~ 1

Author(s):

Zhenling Wang

Keyword(s):

Feature Selection ◽

Peer To Peer ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Traffic Identification ◽

Identification Approach

A Method for TLS Malicious Traffic Identification Based on Machine Learning

Advances in Science and Technology ◽

10.4028/www.scientific.net/ast.105.291 ◽

2021 ◽

Vol 105 ◽

pp. 291-301

Author(s):

Wei Wang ◽

Cheng Sheng Sun ◽

Jia Ning Ye

Keyword(s):

Feature Extraction ◽

Relevant Information ◽

Extraction Process ◽

Identification Accuracy ◽

Security And Privacy ◽

Feature Extraction Method ◽

Traffic Identification ◽

Work Related ◽

Encrypted Traffic ◽

Network Security Management

As more and more malicious traffic is encrypted with the TLS protocol, efficient identification of TLS malicious traffic has become an increasingly important task in network security management for ensuring communication security and privacy. Most traditional methods for identifying TLS-encrypted malicious traffic adopt only the common characteristics of ordinary traffic, which increases coupling among features and lowers identification accuracy. In addition, most previous work on malicious traffic identification extracted features directly from the data flow without recording the extraction process, which makes subsequent traceability difficult. Therefore, this paper implements an efficient feature extraction method with structural correlation for TLS malicious encrypted traffic. The feature extraction process is logged in modules, and an index is used to link related information, so that the context can be analysed and subsequent feature analysis and problem tracing are easier. Finally, Random Forest is used to identify TLS malicious traffic efficiently, with an accuracy of up to 99.38%.
