Bilingual attention based neural machine translation

Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2023-02, Vol. 53 (4), p. 4302-4315
Main Authors: Kang, Liyan, He, Shaojie, Wang, Mingxuan, Long, Fei, Su, Jinsong
Format: Article
Language: English
Description: In recent years, Recurrent Neural Network-based Neural Machine Translation (RNN-based NMT), equipped with an attention mechanism from the decoder to the encoder, has achieved great advances and exhibited strong performance on many language pairs. However, little work has been done on an attention mechanism for the target side, which has the potential to further improve NMT. To address this issue, we propose a novel bilingual attention-based NMT model whose bilingual attention mechanism exploits the decoding history, enabling the model to dynamically select and exploit source-side and target-side information. Compared with previous RNN-based NMT models, ours has two advantages. First, it exercises dynamic control over the ratios at which the source and target contexts contribute to generating the next target word; in this way, the weakly induced structural relations on both sides can be exploited for NMT. Second, through short-cut connections, training errors can be back-propagated directly, which effectively alleviates vanishing and exploding gradients. Experimental results and in-depth analyses on Chinese-English, English-German, and English-French translation tasks show that, with proper configurations, our model significantly surpasses the dominant NMT model, the Transformer. Notably, the proposed model won first prize in the English-Chinese translation task of WMT2018.
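
The abstract gives enough detail to sketch the core idea: the decoder attends both to the source annotations and to its own decoding history, a learned gate sets the ratio at which the two contexts contribute to the next target word, and a short-cut connection eases gradient flow. The PyTorch sketch below is an illustration built only from that description, not the authors' implementation; the class name, module layout, and all variable names are assumptions.

```python
import torch
import torch.nn as nn


class BilingualAttentionSketch(nn.Module):
    """Illustrative gated fusion of source context and target-side
    (decoding-history) context, per the abstract; not the paper's code."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # One attention over encoder outputs, one over past decoder states.
        self.src_attn = nn.MultiheadAttention(hidden_size, num_heads=1, batch_first=True)
        self.tgt_attn = nn.MultiheadAttention(hidden_size, num_heads=1, batch_first=True)
        # Gate deciding, per hidden dimension, the source/target mixing ratio.
        self.gate = nn.Linear(3 * hidden_size, hidden_size)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, dec_state, src_annotations, tgt_history):
        # dec_state:       (batch, 1, hidden)        current decoder state
        # src_annotations: (batch, src_len, hidden)   encoder outputs
        # tgt_history:     (batch, tgt_len, hidden)   previously decoded states
        c_src, _ = self.src_attn(dec_state, src_annotations, src_annotations)
        c_tgt, _ = self.tgt_attn(dec_state, tgt_history, tgt_history)
        # Sigmoid gate g in (0, 1) controls how much each context contributes
        # to the generation of the next target word.
        g = torch.sigmoid(self.gate(torch.cat([dec_state, c_src, c_tgt], dim=-1)))
        fused = g * c_src + (1.0 - g) * c_tgt
        # Short-cut (residual) connection: errors back-propagate directly to
        # dec_state, mitigating vanishing/exploding gradients as the abstract notes.
        return dec_state + self.proj(fused)


# Minimal smoke test with random tensors.
model = BilingualAttentionSketch(hidden_size=8)
out = model(torch.randn(2, 1, 8), torch.randn(2, 5, 8), torch.randn(2, 3, 8))
print(out.shape)  # torch.Size([2, 1, 8])
```

Under these assumptions the gate g is computed per hidden dimension, so the model can lean on the source context for content words while drawing on the decoding history for target-side fluency, which matches the "dynamic control over the ratios" claim in the abstract.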
DOI: 10.1007/s10489-022-03563-8
ISSN: 0924-669X
EISSN: 1573-7497
Publisher: New York: Springer US
Source: ABI/INFORM Global; Springer Nature
Subjects: Artificial Intelligence
Back propagation
Bilingualism
Coders
Computer Science
Decoding
Dynamic control
English language
Machine translation
Machines
Manufacturing
Mechanical Engineering
Processes
Recurrent neural networks
Translations