Utility-Aware Optimal Data Selection for Differentially Private Federated Learning in IoV

Bibliographic Details
Published in:	IEEE internet of things journal 2024-10, Vol.11 (20), p.33326-33336
Main Authors:	Zhang, Jiancong; Li, Shining; Wang, Changhao
Format:	Article
Language:	English
Subjects:	Adaptive sampling; Algorithms; Convergence; Cost function; Data models; Datasets; Design optimization; Differential privacy; Federated learning; Heterogeneity; Internet of Vehicles; Noise; Noisy gradient descent (NGD); Optimization; Privacy; Sensitivity; Training; Utility evaluation

Description
Summary:	Federated learning trains models by coordinating distributed data sets, so the choice of data sets has a significant impact on model performance. Personalized differential privacy, however, introduces heterogeneity into vehicular data sets: stronger privacy protection may reduce the contribution of a local model to the convergence of the global model. The goal of this article is therefore to dynamically optimize the combination of data sets in order to tackle this heterogeneity in differentially private federated learning in the Internet of Vehicles. This is extremely challenging without direct data access and a visible training process. To address it, we propose an efficient hierarchical data selection method. First, utility is evaluated using the convergence bound derived from the noise function and the cost function. Accordingly, a collection of high-value clients is selected to maximize the potential contribution of the combination to the global model. Then, we design an optimization function based on the unknown variables within the convergence bound and develop a low-complexity algorithm to approximate the sampling probability. Meanwhile, the aggregation weight of each model is adjusted to ensure unbiased estimation. Experimental results on two real-world trajectory data sets show that the scheme reduces the meter error by 8.90% and 15.97%, respectively, and improves the convergence speed by 23.9% and 27.1%, respectively.
ISSN:	2327-4662
DOI:	10.1109/JIOT.2024.3427132
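
The summary mentions sampling clients by utility and adjusting aggregation weights to keep the estimate unbiased. The record does not give the article's formulas, so the following is only a minimal sketch of the generic idea using standard inverse-probability weighting, not the authors' algorithm. All names (`sampling_probabilities`, `unbiased_aggregate`, `data_weights`) and the proportional-to-utility rule are assumptions made for illustration.

```python
import random


def sampling_probabilities(utilities):
    """Turn utility scores into a sampling distribution.

    Assumption: probabilities are simply proportional to utility. The article
    derives its probabilities from a convergence bound, which is not
    reproduced here.
    """
    total = sum(utilities)
    if total <= 0 or any(u <= 0 for u in utilities):
        # Every client needs a positive probability, otherwise the
        # inverse-probability estimate below would be biased.
        raise ValueError("all utilities must be positive")
    return [u / total for u in utilities]


def unbiased_aggregate(updates, data_weights, probs, num_draws, rng):
    """Estimate sum_i data_weights[i] * updates[i] from sampled clients.

    Clients are drawn with replacement according to probs. Each drawn update
    is rescaled by data_weights[i] / (probs[i] * num_draws), so the expected
    value of the result equals the full weighted sum.
    """
    dim = len(updates[0])
    drawn = rng.choices(range(len(updates)), weights=probs, k=num_draws)
    aggregate = [0.0] * dim
    for i in drawn:
        scale = data_weights[i] / (probs[i] * num_draws)
        for d in range(dim):
            aggregate[d] += scale * updates[i][d]
    return aggregate


if __name__ == "__main__":
    # Hypothetical toy data: three clients with two-dimensional updates.
    updates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    data_weights = [0.2, 0.3, 0.5]   # e.g. share of total data per client
    utilities = [1.0, 2.0, 2.0]      # hypothetical utility scores
    probs = sampling_probabilities(utilities)
    estimate = unbiased_aggregate(
        updates, data_weights, probs, num_draws=2, rng=random.Random(0)
    )
    print(estimate)
```

Dividing by the sampling probability is what offsets the preference for high-utility clients: a client that is drawn more often contributes proportionally less per draw, so the expectation matches the aggregate over all clients.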