Loading…
Exploring Discriminative Word-Level Domain Contexts for Multi-Domain Neural Machine Translation
Owing to its practical significance, multi-domain Neural Machine Translation (NMT) has attracted much attention recently. Recent studies mainly focus on constructing a unified NMT model with mixed-domain training corpora to switch translation between different domains. In these models, the words in...
Saved in:
Published in: | IEEE transactions on pattern analysis and machine intelligence 2021-05, Vol.43 (5), p.1530-1545 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Owing to its practical significance, multi-domain Neural Machine Translation (NMT) has attracted much attention recently. Recent studies mainly focus on constructing a unified NMT model with mixed-domain training corpora to switch translation between different domains. In these models, the words in the same sentence are not well distinguished, while intuitively, they are related to the sentence domain to varying degrees and thus should exert different effects on the multi-domain NMT model. In this article, we are committed to distinguishing and exploiting different word-level domain contexts for multi-domain NMT. For this purpose, we adopt multi-task learning to jointly model NMT and monolingual attention-based domain classification tasks, improving the NMT model in two ways: 1) One domain classifier and one adversarial domain classifier are introduced to conduct domain classifications of input sentences. During this process, two generated gating vectors are used to produce domain-specific and domain-shared annotations for decoder; 2) We equip decoder with an attentional domain classifier. Then, the derived attentional weights are utilized to refine the model training via word-level cost weighting, so that the impacts of target words can be discriminated by their relevance to sentence domain. Experimental results on several multi-domain translations demonstrate the effectiveness of our model. |
---|---|
ISSN: | 0162-8828 1939-3539 2160-9292 |
DOI: | 10.1109/TPAMI.2019.2954406 |