Elephant Flow Detection With Random Forest Models Under Programmable Network Dataplane Constraints

Bibliographic Details
Published in: IEEE Access, 2024, Vol. 12, p. 158561-158578
Main Authors: Jurkiewicz, Piotr; Kadziolka, Bartosz; Kantor, Miroslaw; Wojcik, Robert; Domzal, Jerzy
Format: Article
Language: English
Description
Summary: This paper investigates the application of tree-based machine learning classifiers for flow-based traffic engineering, focusing on the binary classification of IP network flows into mice (short flows) and elephants (long flows) using 5-tuple header fields from the first packet. Unlike prior studies on network flow classification, our analysis uses performance metrics normalized by traffic coverage, ensuring relevance for traffic engineering and QoS applications. We also evaluate models within the constraints of programmable switching hardware, such as the Intel Tofino P4 chip. Our findings show that such constrained models can achieve high accuracy while performing inference at line rate in the dataplane. Additionally, we reveal a trade-off between tree depth and input format, with bit transformations enabling more efficient feature extraction at lower depths. Our results show that optimal tree depths range from 15 to 25 levels, depending on the input format. The most effective model employs extremely randomized trees with bit-transformed input and trees of depth 20.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3485588
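
The summary mentions "bit-transformed input" built from the 5-tuple header fields of a flow's first packet. The sketch below illustrates one plausible reading of that idea: each header field is expanded into its individual bits, so that a tree split can test a single bit. This record does not describe the paper's actual encoding, so the IPv4-only assumption, the field widths, the MSB-first ordering, and the function names `to_bits` and `five_tuple_to_bits` are all illustrative assumptions, not the authors' method.

```python
import ipaddress

# Assumed field widths in bits for an IPv4 5-tuple (not taken from the paper):
# source IP, destination IP, source port, destination port, protocol number.
FIELD_WIDTHS = (32, 32, 16, 16, 8)


def to_bits(value, width):
    """Return the `width` least-significant bits of `value`, MSB first."""
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def five_tuple_to_bits(src_ip, dst_ip, src_port, dst_port, proto):
    """Flatten an IPv4 5-tuple into a list of 104 binary (0/1) features."""
    values = (
        int(ipaddress.IPv4Address(src_ip)),
        int(ipaddress.IPv4Address(dst_ip)),
        src_port,
        dst_port,
        proto,
    )
    bits = []
    for value, width in zip(values, FIELD_WIDTHS):
        bits.extend(to_bits(value, width))
    return bits


# Hypothetical first packet of a TCP flow (protocol number 6).
features = five_tuple_to_bits("10.0.0.1", "192.168.1.20", 443, 51000, 6)
print(len(features))
```

A vector like `features` could then be passed to a tree-based classifier; with one feature per bit, a single split can isolate a prefix bit or a port bit, which is consistent with the summary's remark that bit transformations help at lower tree depths.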