Loading…

Dynamic Multi-Branch Layers for On-Device Neural Machine Translation

With the rapid development of artificial intelligence (AI), there is a trend in moving AI applications, such as neural machine translation (NMT), from cloud to mobile devices. Constrained by limited hardware resources and battery, the performance of on-device NMT systems is far from satisfactory. In...

Full description

Saved in:
Bibliographic Details
Published in:IEEE/ACM transactions on audio, speech, and language processing speech, and language processing, 2022, Vol.30, p.958-967
Main Authors: Tan, Zhixing, Yang, Zeyuan, Zhang, Meng, Liu, Qun, Sun, Maosong, Liu, Yang
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:With the rapid development of artificial intelligence (AI), there is a trend in moving AI applications, such as neural machine translation (NMT), from cloud to mobile devices. Constrained by limited hardware resources and battery, the performance of on-device NMT systems is far from satisfactory. Inspired by conditional computation, we propose to improve the performance of on-device NMT systems with dynamic multi-branch layers. Specifically, we design a layer-wise dynamic multi-branch network with only one branch activated during training and inference. As not all branches are activated during training, we propose shared-private reparameterization to ensure sufficient training for each branch. At almost the same computational cost, our method achieves improvements of up to 1.7 BLEU points on the WMT14 English-German translation task and 1.8 BLEU points on the WMT20 Chinese-English translation task over the Transformer model, respectively. Compared with a strong baseline that also uses multiple branches, the proposed method is up to 1.5 times faster with the same number of parameters.
ISSN:2329-9290
2329-9304
DOI:10.1109/TASLP.2022.3153257