Bilingual attention based neural machine translation

Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2023-02, Vol. 53 (4), p. 4302-4315
Main Authors: Kang, Liyan, He, Shaojie, Wang, Mingxuan, Long, Fei, Su, Jinsong
Format: Article
Language: English
Description: In recent years, Recurrent Neural Network-based Neural Machine Translation (RNN-based NMT), equipped with an attention mechanism from the decoder to the encoder, has achieved great advances and exhibited strong performance on many language pairs. However, little work has been done on an attention mechanism for the target side, which has the potential to further improve NMT. To address this issue, we propose a novel bilingual attention-based NMT model whose bilingual attention mechanism exploits the decoding history, enabling the model to dynamically select and exploit source-side and target-side information. Compared with previous RNN-based NMT models, ours has two advantages. First, it exercises dynamic control over the ratios at which the source and target contexts contribute to generating the next target word; in this way, the weakly induced structural relations on both sides can be exploited for NMT. Second, through short-cut connections, training errors can be back-propagated directly, which effectively alleviates vanishing and exploding gradients. Experimental results and in-depth analyses on Chinese-English, English-German, and English-French translation tasks show that, with proper configurations, our model significantly surpasses the dominant NMT model, the Transformer. Notably, the proposed model won first prize in the English-Chinese translation task of WMT2018.
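
The abstract gives enough detail to sketch the core idea: the decoder attends both to the source annotations and to its own decoding history, a learned gate sets the ratio at which the two contexts contribute to the next target word, and a short-cut connection eases gradient flow. The PyTorch sketch below is an illustration built only from that description, not the authors' implementation; the class name, module layout, and all variable names are assumptions.

```python
import torch
import torch.nn as nn


class BilingualAttentionSketch(nn.Module):
    """Illustrative gated fusion of source context and target-side
    (decoding-history) context, per the abstract; not the paper's code."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # One attention over encoder outputs, one over past decoder states.
        self.src_attn = nn.MultiheadAttention(hidden_size, num_heads=1, batch_first=True)
        self.tgt_attn = nn.MultiheadAttention(hidden_size, num_heads=1, batch_first=True)
        # Gate deciding, per hidden dimension, the source/target mixing ratio.
        self.gate = nn.Linear(3 * hidden_size, hidden_size)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, dec_state, src_annotations, tgt_history):
        # dec_state:       (batch, 1, hidden)        current decoder state
        # src_annotations: (batch, src_len, hidden)   encoder outputs
        # tgt_history:     (batch, tgt_len, hidden)   previously decoded states
        c_src, _ = self.src_attn(dec_state, src_annotations, src_annotations)
        c_tgt, _ = self.tgt_attn(dec_state, tgt_history, tgt_history)
        # Sigmoid gate g in (0, 1) controls how much each context contributes
        # to the generation of the next target word.
        g = torch.sigmoid(self.gate(torch.cat([dec_state, c_src, c_tgt], dim=-1)))
        fused = g * c_src + (1.0 - g) * c_tgt
        # Short-cut (residual) connection: errors back-propagate directly to
        # dec_state, mitigating vanishing/exploding gradients as the abstract notes.
        return dec_state + self.proj(fused)


# Minimal smoke test with random tensors.
model = BilingualAttentionSketch(hidden_size=8)
out = model(torch.randn(2, 1, 8), torch.randn(2, 5, 8), torch.randn(2, 3, 8))
print(out.shape)  # torch.Size([2, 1, 8])
```

Under these assumptions the gate g is computed per hidden dimension, so the model can lean on the source context for content words while drawing on the decoding history for target-side fluency, which matches the "dynamic control over the ratios" claim in the abstract.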
DOI: 10.1007/s10489-022-03563-8
ISSN: 0924-669X
EISSN: 1573-7497
Publisher: New York: Springer US
Source: ABI/INFORM Global; Springer Nature
Subjects: Artificial Intelligence
Back propagation
Bilingualism
Coders
Computer Science
Decoding
Dynamic control
English language
Machine translation
Machines
Manufacturing
Mechanical Engineering
Processes
Recurrent neural networks
Translations