Elephant Flow Detection With Random Forest Models Under Programmable Network Dataplane Constraints
Published in: | IEEE Access 2024, Vol.12, p.158561-158578 |
---|---|
Format: | Article |
Language: | English |
Summary: | This paper investigates the application of tree-based machine learning classifiers for flow-based traffic engineering, focusing on the binary classification of IP network flows into mice (short flows) and elephants (long flows) using 5-tuple header fields from the first packet. Unlike prior studies on network flow classification, our analysis uses performance metrics normalized by traffic coverage, ensuring relevance for traffic engineering and QoS applications. We also evaluate models within the constraints of programmable switching hardware, such as the Intel Tofino P4 chip. Our findings show that such constrained models can achieve high accuracy while performing inference at line rate in the dataplane. Additionally, we reveal a trade-off between tree depth and input format, with bit transformations enabling more efficient feature extraction at lower depths. Our results show that optimal tree depths range from 15 to 25 levels, depending on the input format. The most effective model employs extremely randomized trees with bit-transformed input and trees of depth 20. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2024.3485588 |
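
The summary mentions a bit-transformed input format for the 5-tuple header fields. The record does not spell out the exact encoding, so the following is only a minimal sketch under stated assumptions: IPv4 addresses, most-significant-bit-first ordering, and a flat vector of 104 binary features (32 + 32 + 16 + 16 + 8). The function name `five_tuple_to_bits` and the sample flow are hypothetical and not taken from the paper.

```python
import ipaddress


def five_tuple_to_bits(src_ip, dst_ip, src_port, dst_port, protocol):
    """Expand an IPv4 5-tuple into a flat list of 104 binary features.

    Assumption (not from the paper): each field is written out
    most-significant bit first, in the order src IP, dst IP,
    src port, dst port, protocol.
    """
    fields = [
        (int(ipaddress.IPv4Address(src_ip)), 32),
        (int(ipaddress.IPv4Address(dst_ip)), 32),
        (src_port, 16),
        (dst_port, 16),
        (protocol, 8),
    ]
    bits = []
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        # Emit one 0/1 feature per bit, highest bit first.
        bits.extend((value >> shift) & 1 for shift in range(width - 1, -1, -1))
    return bits


# Hypothetical example flow: HTTPS over TCP (protocol number 6).
features = five_tuple_to_bits("10.0.0.1", "192.168.1.20", 443, 51234, 6)
```

With this representation, each split in a decision tree tests a single header bit rather than a threshold on a 32-bit integer, which is one plausible reading of why the summary reports more efficient feature extraction at lower tree depths. A vector like `features` could then be passed to a tree ensemble such as scikit-learn's `ExtraTreesClassifier` with `max_depth=20`, although the paper's actual training setup is not described in this record.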