Unveiling the Effects of Slightly Skewed Labels on Traffic Data Analysis

Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2024-10, p. 1-10
Main Authors: Pei, Jiaming; Li, Wei
Format: Article
Language: English
Description
Summary: Data heterogeneity is a prevalent challenge in intelligent transportation systems (ITS), often arising from variations in traffic patterns across different regions or time periods. For instance, certain traffic events, such as congestion, may be more frequent in urban areas during peak hours, while other events, like accidents, might occur more often in suburban regions, leading to slightly skewed label distributions. While federated learning provides an effective solution for distributed data, its performance can degrade when client datasets exhibit such label skew. To address this, we propose a strategy that combines Gaussian mixture clustering with oversampling. Gaussian mixture clustering can handle overlapping data points in model parameters, but insufficient client samples may limit local model training. To overcome this, we introduce a Gaussian mixture-based oversampling method to generate additional samples, enhancing the robustness of federated learning under slightly skewed label scenarios. Our experiments demonstrate that this method outperforms or matches existing approaches, ensuring more reliable and accurate ITS applications.
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2024.3483234
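
Illustrative sketch (not from the article): the summary mentions a Gaussian mixture-based oversampling step that generates extra samples for clients with skewed labels. This record does not give the authors' procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: each under-represented class on a single client gets its own diagonal-covariance Gaussian mixture, fitted with plain EM, and synthetic points are drawn from it until the class counts match. The function names (`fit_gmm`, `sample_gmm`, `oversample_client`), the two features (speed, flow), and the toy data are all hypothetical. The client-clustering step on model parameters is not covered.

```python
import math
import random
from collections import Counter


def fit_gmm(points, k, iters=50, seed=0):
    """Fit a diagonal-covariance Gaussian mixture with EM (illustrative only)."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    means = [list(p) for p in rng.sample(points, k)]  # init means from data
    # Initialise every component with the overall per-feature variance.
    base_var = []
    for t in range(d):
        mu = sum(p[t] for p in points) / n
        base_var.append(max(sum((p[t] - mu) ** 2 for p in points) / n, 1e-6))
    variances = [list(base_var) for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point (log-space).
        resp = []
        for p in points:
            logs = []
            for j in range(k):
                lp = math.log(weights[j])
                for x, m, v in zip(p, means[j], variances[j]):
                    lp -= 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
                logs.append(lp)
            top = max(logs)
            probs = [math.exp(l - top) for l in logs]
            total = sum(probs)
            resp.append([q / total for q in probs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-9  # guard against empty components
            weights[j] = nj / n
            for t in range(d):
                m = sum(r[j] * p[t] for r, p in zip(resp, points)) / nj
                v = sum(r[j] * (p[t] - m) ** 2 for r, p in zip(resp, points)) / nj
                means[j][t] = m
                variances[j][t] = max(v, 1e-6)  # variance floor
    return weights, means, variances


def sample_gmm(model, n, rng):
    """Draw n synthetic points from a fitted mixture."""
    weights, means, variances = model
    out = []
    for _ in range(n):
        j = rng.choices(range(len(weights)), weights=weights)[0]
        out.append([rng.gauss(m, math.sqrt(v))
                    for m, v in zip(means[j], variances[j])])
    return out


def oversample_client(features, labels, k=2, seed=0):
    """Top up every minority class to the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    new_x, new_y = list(features), list(labels)
    for label, count in counts.items():
        if count == target:
            continue
        pts = [x for x, y in zip(features, labels) if y == label]
        model = fit_gmm(pts, min(k, len(pts)), seed=seed)
        synth = sample_gmm(model, target - count, rng)
        new_x.extend(synth)
        new_y.extend([label] * len(synth))
    return new_x, new_y


# Hypothetical client with slightly skewed labels: (speed km/h, flow veh/h).
data_rng = random.Random(42)
congestion = [[data_rng.gauss(15, 3), data_rng.gauss(900, 50)] for _ in range(40)]
accident = [[data_rng.gauss(35, 5), data_rng.gauss(400, 60)] for _ in range(8)]
features = congestion + accident
labels = ["congestion"] * 40 + ["accident"] * 8

balanced_x, balanced_y = oversample_client(features, labels)
print(Counter(balanced_y))
```

In practice a library implementation such as scikit-learn's `GaussianMixture` would likely replace the hand-written EM loop; the pure-Python version is used here only to keep the sketch dependency-free.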