Bilingual attention based neural machine translation
Published in: Applied intelligence (Dordrecht, Netherlands), 2023-02, Vol. 53 (4), p. 4302-4315
Main Authors: Kang, Liyan; He, Shaojie; Wang, Mingxuan; Long, Fei; Su, Jinsong
Format: Article
Language: English
Subjects: Artificial Intelligence; Back propagation; Bilingualism; Coders; Computer Science; Decoding; Dynamic control; English language; Machine translation; Machines; Manufacturing; Mechanical Engineering; Processes; Recurrent neural networks; Translations
Publisher: New York: Springer US
ISSN: 0924-669X
EISSN: 1573-7497
DOI: 10.1007/s10489-022-03563-8

Abstract: In recent years, Recurrent Neural Network based Neural Machine Translation (RNN-based NMT), equipped with an attention mechanism from the decoder to the encoder, has achieved great advances and exhibited strong performance on many language pairs. However, little work has been done on an attention mechanism for the target side, which has the potential to further improve NMT. To address this issue, we propose a novel bilingual attention based NMT model whose bilingual attention mechanism exploits the decoding history, enabling the model to dynamically select and exploit source-side and target-side information. Compared with previous RNN-based NMT models, our model has two advantages. First, it exercises dynamic control over the ratios at which the source and target contexts contribute to the generation of the next target word; in this way, the weakly induced structural relations on both sides can be exploited for NMT. Second, through short-cut connections, the training errors of our model can be back-propagated directly, which effectively alleviates the vanishing or exploding gradient issue. Experimental results and in-depth analyses on Chinese-English, English-German, and English-French translation tasks show that our model, with proper configurations, can significantly surpass the dominant NMT model, Transformer. Notably, our proposed model won first prize in the English-Chinese translation task of WMT2018.
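
The abstract describes two mechanisms: a gate that dynamically balances how much the source context and the target-side decoding history contribute to the next-word prediction, and a short-cut connection that lets training errors back-propagate directly. Below is a minimal, hypothetical PyTorch sketch of that idea; the class name `BilingualAttention`, the dot-product attention, and all shapes are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of gated bilingual attention, based only on the
# abstract's description; all names and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BilingualAttention(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        # Separate query projections for the source and target sides.
        self.src_query = nn.Linear(hidden_size, hidden_size)
        self.tgt_query = nn.Linear(hidden_size, hidden_size)
        # Gate controlling the source/target mixing ratio.
        self.gate = nn.Linear(3 * hidden_size, hidden_size)

    @staticmethod
    def attend(query, keys, mask=None):
        # Scaled dot-product attention; keys double as values.
        # query: (batch, hidden), keys: (batch, length, hidden)
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)
        scores = scores / keys.size(-1) ** 0.5
        if mask is not None:  # mask: (batch, length) bool, True = valid
            scores = scores.masked_fill(~mask, float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)

    def forward(self, state, src_states, tgt_history, src_mask=None):
        # state:       (batch, hidden)          current decoder state
        # src_states:  (batch, src_len, hidden) encoder outputs
        # tgt_history: (batch, tgt_len, hidden) previously decoded states
        c_src = self.attend(self.src_query(state), src_states, src_mask)
        c_tgt = self.attend(self.tgt_query(state), tgt_history)
        # Element-wise gate in (0, 1): the "ratio" between the two sides.
        g = torch.sigmoid(self.gate(torch.cat([state, c_src, c_tgt], dim=-1)))
        context = g * c_src + (1.0 - g) * c_tgt
        # Short-cut (residual) connection: gradients flow straight back
        # to the decoder state, easing vanishing/exploding gradients.
        return state + context


# Quick shape check with random tensors.
attn = BilingualAttention(hidden_size=8)
out = attn(torch.randn(2, 8), torch.randn(2, 5, 8), torch.randn(2, 3, 8))
print(out.shape)  # torch.Size([2, 8])
```

When the gate saturates toward 1, generation is driven by the source context; toward 0, by the decoding history. This is one plausible reading of the dynamic source/target control the abstract claims, and the final `state + context` residual is one plausible reading of the short-cut connection.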