
Semi-2DCAE: a semi-supervision 2D-CNN AutoEncoder model for feature representation and classification of encrypted traffic

Bibliographic Details
Published in: PeerJ Computer Science, 2023-11, Vol. 9, Article e1635
Main Authors: Cui, Jun; Bai, Longkun; Li, Guangxu; Lin, Zhigui; Zeng, Penggao
Format: Article
Language: English
Description
Summary: Traffic classification is essential in network-related areas such as network management, monitoring, and security. As the proportion of encrypted internet traffic rises, the accuracy of port-based and DPI-based traffic classification methods has declined. Methods based on machine learning and deep learning have improved the accuracy of traffic classification, but they still suffer from inadequate extraction of traffic structure features and poor feature representativeness. This article proposes a model called Semi-supervision 2-Dimensional Convolution AutoEncoder (Semi-2DCAE). The model extracts the spatial structure features of the original network traffic with a 2-dimensional convolutional neural network (2D-CNN) and uses the autoencoder structure to reduce the dimensionality of the data, so that different traffic features are represented as spectral lines in different intervals of a one-dimensional standard coordinate system, which the authors call FlowSpectrum. The PReLU activation function is added to the model to stabilize the training process. The ISCX-VPN2016 dataset is used to test the classification performance of the FlowSpectrum model. The experimental results show that the proposed model can characterize encrypted traffic features in a one-dimensional coordinate system and classify Non-VPN encrypted traffic with an accuracy of up to 99.2%, about 7% better than the state-of-the-art solution, and VPN encrypted traffic with an accuracy of 98.3%, about 2% better than the state-of-the-art solution.
ISSN: 2376-5992
DOI: 10.7717/peerj-cs.1635
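
The summary mentions two ideas that a small example may make more concrete: the PReLU activation, and classification by the interval of a one-dimensional coordinate in which a sample's code falls. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the fixed slope `a = 0.25`, the per-class min/max interval rule, the nearest-interval fallback, and the class labels are all hypothetical choices made for illustration; in a real PReLU layer the slope is learned during training, and the one-dimensional codes would come from the trained encoder rather than being typed in by hand.

```python
def prelu(x: float, a: float = 0.25) -> float:
    """PReLU: identity for positive inputs, slope `a` for non-positive ones.

    Assumption: `a` is fixed here; in a trained network it is a learned parameter.
    """
    return x if x > 0 else a * x


def fit_intervals(codes, labels):
    """Record the [min, max] range of 1-D codes observed for each class.

    Assumption: this simple range rule stands in for however the
    article actually delimits intervals on its coordinate system.
    """
    ranges = {}
    for code, label in zip(codes, labels):
        lo, hi = ranges.get(label, (code, code))
        ranges[label] = (min(lo, code), max(hi, code))
    return ranges


def classify(code, ranges):
    """Return the class whose interval contains `code`.

    If no interval contains it, fall back to the class with the
    nearest interval endpoint (a hypothetical tie-breaking choice).
    """
    def distance(item):
        lo, hi = item[1]
        if lo <= code <= hi:
            return 0.0
        return min(abs(code - lo), abs(code - hi))

    return min(ranges.items(), key=distance)[0]


# Hypothetical 1-D codes, as if produced by an encoder's bottleneck.
codes = [0.10, 0.15, 0.52, 0.58, 0.90]
labels = ["chat", "chat", "email", "email", "voip"]
ranges = fit_intervals(codes, labels)
```

With these made-up values, `classify(0.12, ranges)` falls inside the interval recorded for `"chat"`, while a code outside every interval is assigned to the closest one.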