Network-Density-Controlled Decentralized Parallel Stochastic Gradient Descent in Wireless Systems

Bibliographic Details
Main Authors: Sato, Koya; Satoh, Yasuyuki; Sugimura, Daisuke
Format: Conference Proceeding
Language: English
Description
Summary: This paper proposes a communication strategy for decentralized learning on wireless systems. Our discussion is based on decentralized parallel stochastic gradient descent (D-PSGD), one of the state-of-the-art algorithms for decentralized learning. The main contribution of this paper is to raise a novel open question for decentralized learning on wireless systems: the density of the network topology may significantly influence the runtime performance of D-PSGD. In real wireless network systems, it is generally difficult to guarantee delay-free communication without deterioration because of path loss and multipath fading, and these factors significantly degrade the runtime performance of D-PSGD. To alleviate such problems, we first analyze the runtime performance of D-PSGD under realistic wireless conditions. This analysis yields two key insights: compared with a sparse topology, a dense network topology (1) does not significantly improve the training accuracy of D-PSGD, and (2) strongly degrades the runtime performance because it generally requires low-rate transmission. Based on these findings, we propose a novel communication strategy in which each node estimates the optimal transmission rate such that the communication time during D-PSGD optimization is minimized under a constraint on network density, which is characterized by radio propagation properties. Numerical simulations show that the proposed strategy improves the runtime performance of D-PSGD in wireless systems.
ISSN: 1938-1883
DOI: 10.1109/ICC40277.2020.9149125
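
The rate-versus-density trade-off described in the summary can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a Shannon-capacity link model with distance-based path loss and no fading, nodes that broadcast their model one after another, and plain graph connectivity as a stand-in for the paper's network-density constraint. All function names, parameters, and default values (`max_range`, `pick_rate`, `tx_snr_db`, and so on) are hypothetical.

```python
import math


def max_range(rate_bps, bandwidth_hz=1e6, tx_snr_db=60.0, path_loss_exp=3.0):
    """Largest distance (m) at which `rate_bps` is still decodable.

    Assumption: rate <= B * log2(1 + SNR(d)), with
    SNR(d) = SNR_at_1m * d ** (-path_loss_exp) and no fading.
    """
    required_snr = 2.0 ** (rate_bps / bandwidth_hz) - 1.0
    snr_at_1m = 10.0 ** (tx_snr_db / 10.0)
    return (snr_at_1m / required_snr) ** (1.0 / path_loss_exp)


def neighbors(positions, radius):
    """Adjacency lists: node j is a neighbor of i if it lies within `radius`."""
    n = len(positions)
    return [
        [j for j in range(n)
         if j != i and math.dist(positions[i], positions[j]) <= radius]
        for i in range(n)
    ]


def is_connected(adj):
    """Depth-first search from node 0; True if every node is reached."""
    seen, stack = {0}, [0]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


def pick_rate(positions, candidate_rates, model_bits):
    """Fastest candidate rate whose induced topology is still connected.

    Returns (rate, seconds_per_round, adjacency). The per-round time
    assumes each node broadcasts its model once, one node at a time.
    """
    for rate in sorted(candidate_rates, reverse=True):
        adj = neighbors(positions, max_range(rate))
        if is_connected(adj):
            return rate, len(positions) * model_bits / rate, adj
    raise ValueError("no candidate rate yields a connected topology")


# Four nodes on a line, 40 m apart, exchanging an 8 Mbit model.
nodes = [(0.0, 0.0), (40.0, 0.0), (80.0, 0.0), (120.0, 0.0)]
rate, seconds, adj = pick_rate(nodes, [1e6, 2e6, 4e6, 8e6], model_bits=8e6)
print(rate, seconds, adj)
```

Under these assumptions, the lowest candidate rate reaches more neighbors per node but takes four times longer per round than the selected rate, which mirrors the summary's observation that a dense topology forces low-rate transmission.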