
Dynamic Multi-Branch Layers for On-Device Neural Machine Translation

With the rapid development of artificial intelligence (AI), there is a trend toward moving AI applications, such as neural machine translation (NMT), from the cloud to mobile devices. Constrained by limited hardware resources and battery life, the performance of on-device NMT systems is far from satisfactory. Inspired by conditional computation, we propose to improve the performance of on-device NMT systems with dynamic multi-branch layers. Specifically, we design a layer-wise dynamic multi-branch network with only one branch activated during training and inference. As not all branches are activated during training, we propose shared-private reparameterization to ensure sufficient training for each branch. At almost the same computational cost, our method achieves improvements of up to 1.7 BLEU points on the WMT14 English-German translation task and 1.8 BLEU points on the WMT20 Chinese-English translation task over the Transformer model. Compared with a strong baseline that also uses multiple branches, the proposed method is up to 1.5 times faster with the same number of parameters.
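The abstract describes a layer in which a gate selects exactly one of several branches per forward pass, and each branch's weights are reparameterized as a shared component plus a branch-private component. The following is an illustrative NumPy sketch of that idea, not the authors' implementation: the gating network, dimensions, ReLU branch, and all variable names are assumptions.

```python
import numpy as np

# Hypothetical sketch of a dynamic multi-branch feed-forward layer:
# a gate scores all branches, but only the single highest-scoring
# branch is evaluated, so the compute cost stays close to that of a
# one-branch layer.

rng = np.random.default_rng(0)
d_model, n_branches = 8, 4

# Shared-private reparameterization (as described in the abstract):
# each branch's effective weight is a shared matrix, trained whenever
# any branch fires, plus a branch-private matrix.
W_shared = rng.normal(scale=0.1, size=(d_model, d_model))
W_private = rng.normal(scale=0.1, size=(n_branches, d_model, d_model))
W_gate = rng.normal(scale=0.1, size=(d_model, n_branches))

def dynamic_branch_layer(x):
    """x: (d_model,) input vector. Returns (branch output, chosen index)."""
    scores = x @ W_gate                # gate scores for every branch
    k = int(np.argmax(scores))         # activate exactly one branch
    W_k = W_shared + W_private[k]      # reparameterized branch weights
    return np.maximum(x @ W_k, 0.0), k # ReLU branch output

x = rng.normal(size=d_model)
y, chosen = dynamic_branch_layer(x)    # only branch `chosen` was computed
```

Only one of the four branch matrices is ever materialized per call, which is the source of the near-constant computational cost the abstract claims relative to a single-branch layer.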

Bibliographic Details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, Vol. 30, pp. 958-967
Main Authors: Tan, Zhixing, Yang, Zeyuan, Zhang, Meng, Liu, Qun, Sun, Maosong, Liu, Yang
Format: Article
Language: English
DOI: 10.1109/TASLP.2022.3153257
ISSN: 2329-9290
EISSN: 2329-9304
Source: IEEE Electronic Library (IEL) Journals; Association for Computing Machinery: Jisc Collections: ACM OPEN Journals 2023-2025 (reading list)
Subjects: Artificial intelligence
Conditional computation
decoding
Electronic devices
Hardware
Machine translation
Mobile handsets
natural language processing
Performance enhancement
Performance evaluation
Training
Transformers
Translations