Bilingual attention based neural machine translation
Published in: | Applied intelligence (Dordrecht, Netherlands), 2023-02, Vol.53 (4), p.4302-4315 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In recent years, recurrent neural network based neural machine translation (RNN-based NMT), equipped with an attention mechanism from the decoder to the encoder, has advanced considerably and performs well on many language pairs. However, little work has been done on attention over the target side, which has the potential to further improve NMT. To address this issue, we propose a novel bilingual attention based NMT model whose bilingual attention mechanism exploits the decoding history and enables the model to dynamically select and exploit both source-side and target-side information. Compared with previous RNN-based NMT models, our model has two advantages. First, it dynamically controls the ratios at which the source and target contexts contribute to the generation of the next target word, so that the weakly induced structural relations on both sides can be exploited for NMT. Second, through shortcut connections, the training errors of our model can be back-propagated directly, which effectively alleviates the vanishing or exploding gradient problem. Experimental results and in-depth analyses on Chinese-English, English-German, and English-French translation tasks show that our model, with proper configurations, can significantly surpass the dominant NMT model, the Transformer. In particular, our proposed model won first prize in the English-Chinese translation task of WMT2018. |
---|---|
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-022-03563-8 |