
Decentralized learning works: An empirical comparison of gossip learning and federated learning

Bibliographic Details
Published in: Journal of Parallel and Distributed Computing, 2021-02, Vol. 148, pp. 109-124
Main Authors: Hegedűs, István, Danner, Gábor, Jelasity, Márk
Format: Article
Language: English
Description
Summary: Machine learning over distributed data stored by many clients has important applications in use cases where data privacy is a key concern or central data storage is not an option. Recently, federated learning was proposed to solve this problem. The assumption is that the data itself is not collected centrally. In a master–worker architecture, the workers perform machine learning over their own data and the master merely aggregates the resulting models without seeing any raw data, not unlike the parameter server approach. Gossip learning is a decentralized alternative to federated learning that does not require an aggregation server or indeed any central component. The natural hypothesis is that gossip learning is strictly less efficient than federated learning due to relying on a more basic infrastructure: only message passing and no cloud resources. In this empirical study, we examine this hypothesis and we present a systematic comparison of the two approaches. The experimental scenarios include a real churn trace collected over mobile phones, continuous and bursty communication patterns, different network sizes and different distributions of the training data over the devices. We also evaluate a number of additional techniques including a compression technique based on sampling, and token account based flow control for gossip learning. We examine the aggregated cost of machine learning in both approaches. Surprisingly, the best gossip variants perform comparably to the best federated learning variants overall, so they offer a fully decentralized alternative to federated learning.

Highlights:
• Fully decentralized machine learning is a viable alternative to federated learning.
• Compression is essential for all the algorithms to achieve competitive performance.
• Uneven class-label distribution over the nodes favors centralization.
• For bursty communication, token-based flow control improves the convergence of gossip.
ISSN: 0743-7315, 1096-0848
DOI: 10.1016/j.jpdc.2020.10.006
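
The summary above contrasts federated learning, where a master averages worker models, with gossip learning, where each node exchanges models directly with random peers. Below is a minimal sketch of a gossip learning node under common assumptions (age-weighted model merging and a local SGD step on a linear model); it is illustrative only and not the authors' implementation, and names such as GossipNode, on_receive, and on_gossip_timer are hypothetical.

```python
# Illustrative sketch of a gossip learning node: keep a local model, merge
# models received from peers by age-weighted averaging, take a local SGD
# step, and periodically push the model to a randomly chosen peer.
# This is an assumption-based example, not the paper's code.

import numpy as np

class GossipNode:
    def __init__(self, dim, local_data):
        self.w = np.zeros(dim)   # model parameters (e.g., linear model weights)
        self.age = 0             # number of updates the model has accumulated
        self.data = local_data   # (X, y) held privately by this node

    def on_receive(self, w_remote, age_remote):
        # Merge step: age-weighted average of the received and local models.
        total = self.age + age_remote
        if total > 0:
            self.w = (self.age * self.w + age_remote * w_remote) / total
        self.age = max(self.age, age_remote)
        self.local_step()

    def local_step(self, lr=0.1):
        # One SGD step on a locally stored example (logistic-regression style,
        # labels in {-1, +1}).
        X, y = self.data
        i = np.random.randint(len(y))
        margin = y[i] * X[i].dot(self.w)
        grad = -y[i] * X[i] / (1.0 + np.exp(margin))
        self.w -= lr * grad
        self.age += 1

    def on_gossip_timer(self, peers):
        # Periodically send the current model to one randomly selected peer;
        # no central aggregator is involved.
        peer = peers[np.random.randint(len(peers))]
        peer.on_receive(self.w.copy(), self.age)
```

In the federated setting described in the summary, the same local step would run on each worker, but a central master would collect the worker models each round and replace them with their (weighted) average instead of the peer-to-peer merge shown here.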