
Dynamic Multi-Branch Layers for On-Device Neural Machine Translation

With the rapid development of artificial intelligence (AI), there is a trend toward moving AI applications, such as neural machine translation (NMT), from the cloud to mobile devices. Constrained by limited hardware resources and battery life, the performance of on-device NMT systems is far from satisfactory. Inspired by conditional computation, we propose to improve the performance of on-device NMT systems with dynamic multi-branch layers. Specifically, we design a layer-wise dynamic multi-branch network with only one branch activated during training and inference. As not all branches are activated during training, we propose shared-private reparameterization to ensure sufficient training for each branch. At almost the same computational cost, our method achieves improvements of up to 1.7 BLEU points on the WMT14 English-German translation task and 1.8 BLEU points on the WMT20 Chinese-English translation task over the Transformer model. Compared with a strong baseline that also uses multiple branches, the proposed method is up to 1.5 times faster with the same number of parameters.
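The abstract describes a layer in which a gate selects exactly one of several branches per forward pass, and each branch's weights are reparameterized as a shared component plus a branch-private component. The following is an illustrative NumPy sketch of that idea, not the authors' implementation: the gating network, dimensions, ReLU branch, and all variable names are assumptions.

```python
import numpy as np

# Hypothetical sketch of a dynamic multi-branch feed-forward layer:
# a gate scores all branches, but only the single highest-scoring
# branch is evaluated, so the compute cost stays close to that of a
# one-branch layer.

rng = np.random.default_rng(0)
d_model, n_branches = 8, 4

# Shared-private reparameterization (as described in the abstract):
# each branch's effective weight is a shared matrix, trained whenever
# any branch fires, plus a branch-private matrix.
W_shared = rng.normal(scale=0.1, size=(d_model, d_model))
W_private = rng.normal(scale=0.1, size=(n_branches, d_model, d_model))
W_gate = rng.normal(scale=0.1, size=(d_model, n_branches))

def dynamic_branch_layer(x):
    """x: (d_model,) input vector. Returns (branch output, chosen index)."""
    scores = x @ W_gate                # gate scores for every branch
    k = int(np.argmax(scores))         # activate exactly one branch
    W_k = W_shared + W_private[k]      # reparameterized branch weights
    return np.maximum(x @ W_k, 0.0), k # ReLU branch output

x = rng.normal(size=d_model)
y, chosen = dynamic_branch_layer(x)    # only branch `chosen` was computed
```

Only one of the four branch matrices is ever materialized per call, which is the source of the near-constant computational cost the abstract claims relative to a single-branch layer.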

Bibliographic Details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, Vol. 30, pp. 958-967
Main Authors: Tan, Zhixing, Yang, Zeyuan, Zhang, Meng, Liu, Qun, Sun, Maosong, Liu, Yang
Format: Article
Language: English
DOI: 10.1109/TASLP.2022.3153257
ISSN: 2329-9290
EISSN: 2329-9304
Source: IEEE Electronic Library (IEL) Journals; Association for Computing Machinery: Jisc Collections: ACM OPEN Journals 2023-2025 (reading list)
Subjects: Artificial intelligence
Conditional computation
decoding
Electronic devices
Hardware
Machine translation
Mobile handsets
natural language processing
Performance enhancement
Performance evaluation
Training
Transformers
Translations