Open and Competitive Multilingual Neural Machine Translation in Production

Bibliographic Details
Published in: Baltic Journal of Modern Computing 2022, Vol. 10 (3), p. 422-434
Main Authors: Tättar, Andre, Purason, Taido, Kuulmets, Hele-Andra, Luhtaru, Agnes, Rätsep, Liisa, Tars, Maali, Pinnis, Mārcis, Bergmanis, Toms, Fishel, Mark
Format: Article
Language: English
Description
Summary: This report presents the results of an Estonian governmental project that aimed to create open-source machine translation systems for Estonian. The project covered six translation directions, between Estonian and each of English, German, and Russian, and five text domains: general, spoken language, legal, military, and crisis texts. The project results include 1) openly distributed parallel and monolingual corpora for the relevant languages, 2) open-source neural machine translation systems trained on them, and 3) new public evaluation benchmarks. Automatic evaluation shows that the resulting systems are highly competitive and match or surpass the performance of other available online machine translation systems. We present the recipe for training such systems, along with other details, and discuss interesting findings made in the process. A live translation demo of the resulting systems (also open-sourced) is available online.
ISSN: 2255-8950, 2255-8942
DOI: 10.22364/bjmc.2022.10.3.15