Loading…

A Bilingual Adversarial Autoencoder for Unsupervised Bilingual Lexicon Induction

Unsupervised bilingual lexicon induction aims to generate bilingual lexicons without any cross-lingual signals. Successfully solving this problem would benefit many downstream tasks, such as unsupervised machine translation and transfer learning. In this work, we propose an unsupervised framework, n...

Full description

Saved in:
Bibliographic Details
Published in:IEEE/ACM transactions on audio, speech, and language processing speech, and language processing, 2019-10, Vol.27 (10), p.1639-1648
Main Authors: Bai, Xuefeng, Cao, Hailong, Chen, Kehai, Zhao, Tiejun
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Unsupervised bilingual lexicon induction aims to generate bilingual lexicons without any cross-lingual signals. Successfully solving this problem would benefit many downstream tasks, such as unsupervised machine translation and transfer learning. In this work, we propose an unsupervised framework, named bilingual adversarial autoencoder, which automatically generates bilingual lexicon for a pair of languages from their monolingual word embeddings. In contrast to existing frameworks which learn a direct cross-lingual mapping of word embeddings from the source language to the target language, we train two autoencoders jointly to transform the source and the target monolingual word embeddings into a shared embedding space, where a word and its translation are close to each other. In this way, we capture the cross-lingual features of word embeddings from different languages and use them to induce bilingual lexicons. By conducting extensive experiments across eight language pairs, we demonstrate that the proposed method significantly outperforms the existing adversarial methods and even achieves best-published results across most language pairs.
ISSN:2329-9290
2329-9304
DOI:10.1109/TASLP.2019.2925973