Open and Competitive Multilingual Neural Machine Translation in Production
Published in: | Baltic Journal of Modern Computing 2022, Vol.10 (3), p.422-434 |
---|---|
Format: | Article |
Language: | English |
Summary: | This report presents the results of an Estonian governmental project that aimed to create open-source machine translation systems for Estonian. The project covered six translation directions, between Estonian and each of English, German, and Russian, and five text domains: general, spoken language, legal, military, and crisis texts. The project results include 1) openly distributed parallel and monolingual corpora for the relevant languages, 2) open-source neural machine translation systems trained on them, and 3) new public evaluation benchmarks. Automatic evaluation shows that the resulting systems are highly competitive and match or surpass the performance of other available online machine translation systems. We present the recipe for training such systems, along with other details, and discuss interesting findings made in the process. A live translation demo of the resulting systems (also open-sourced) is available online. |
ISSN: | 2255-8950 2255-8942 |
DOI: | 10.22364/bjmc.2022.10.3.15 |