Optimizing Resource Allocation in Pipeline Parallelism for Distributed DNN Training
Format: Conference Proceeding
Language: English
Summary: Deep Neural Network (DNN) models have been widely deployed in a variety of applications. Driven by privacy concerns and substantial improvements in the computational power of mobile devices, training machine learning models directly on mobile devices has become increasingly important. Directly applying parallel training frameworks designed for data-center networks to train DNN models on mobile devices may not achieve ideal performance, since mobile devices usually carry multiple types of computation resources, such as ASICs, neural engines, and FPGAs. Moreover, communication time is not negligible when training on mobile devices. With the objective of minimizing DNN training time, we propose to extend pipeline parallelism, which hides communication time behind computation during DNN training, by integrating resource allocation. Fine-tuning the ratio of resources allocated to forward and backward propagation can improve resource utilization. We focus on homogeneous workers and theoretically analyze the ideal case in which resources are linearly separable; we also discuss model partitioning and resource allocation in a more realistic setting. Additionally, we investigate the case of heterogeneous workers. Trace-based simulation results show that our scheme effectively reduces the time cost of a training iteration.
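To make the resource-ratio idea concrete, the following is a minimal sketch of the linearly separable ideal case the summary mentions: a worker gives a fraction r of its compute to forward passes and 1 - r to backward passes, and the pipeline advances at the pace of the slower side. The cost model, function names, and numbers here are illustrative assumptions for this record, not the paper's actual scheme.

```python
# Toy model (illustrative, not the paper's algorithm): with linearly
# separable resources, work assigned a fraction s of the device runs
# s times slower than at full capacity. Overlapping forward and backward
# work of different microbatches, the steady-state per-microbatch period
# is set by whichever side is the bottleneck.

def steady_state_period(t_fwd: float, t_bwd: float, r: float) -> float:
    """Per-microbatch period when forward work t_fwd gets a fraction r of
    the resources and backward work t_bwd gets the remaining 1 - r."""
    return max(t_fwd / r, t_bwd / (1.0 - r))

def optimal_split(t_fwd: float, t_bwd: float) -> float:
    """Balancing t_fwd / r = t_bwd / (1 - r) yields
    r* = t_fwd / (t_fwd + t_bwd)."""
    return t_fwd / (t_fwd + t_bwd)

if __name__ == "__main__":
    t_fwd, t_bwd = 2.0, 4.0  # hypothetical per-microbatch compute costs
    even = steady_state_period(t_fwd, t_bwd, 0.5)       # naive even split
    r_star = optimal_split(t_fwd, t_bwd)                # tuned split
    tuned = steady_state_period(t_fwd, t_bwd, r_star)
    print(f"even split: {even:.2f}, tuned r*={r_star:.2f}: {tuned:.2f}")
```

With these toy numbers the even split yields a period of 8.0 while the balanced split (r* = 1/3) yields 6.0; at the balanced point neither the forward nor the backward side idles, which is the utilization gain the summary attributes to fine-tuning the allocation ratio.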
ISSN: 2690-5965
DOI: 10.1109/ICPADS56603.2022.00029