Language mapping functions: Improving softmax estimation and word embedding quality

Bibliographic Details
Published in: Concurrency and Computation 2021-12, Vol. 33 (24), p. n/a
Main Authors: Rangriz, Emad; Pourahmadi, Vahid
Format: Article
Language: English
Description
Summary: One of the best methods for estimating the softmax layer in neural network language models is noise-contrastive estimation (NCE). However, this method is less suitable for word-embedding applications than other robust methods such as negative sampling (NEG). The NEG method implements the pointwise mutual information (PMI) relation between the word and context spaces in the neural network, whereas the NCE method implements a conditional probability. Both the NCE and NEG methods use a dot-product-based mapping to map word and context vectors to probabilities. This article presents a parametric objective function that takes the mapping function as a parameter, and we derive a parametric relation between the word and context spaces according to that mapping parameter. Using the parametric objective function, we identify the conditions a mapping must satisfy to be a proper choice for both softmax estimation and word embedding. The article also presents two specific mapping functions that meet these conditions and compares their performance with that of the dot-product mapping. The performance of the new mapping functions is also reported on common word-embedding and language-model benchmarks.
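The idea of an objective parameterized by its mapping function can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: `dot_mapping`, `neg_objective`, and the distance-based alternative mapping are hypothetical names chosen for the example, and the loss shown is the standard NEG objective with the score function made swappable.

```python
import numpy as np

def sigmoid(x):
    # Logistic function mapping a real-valued score to a probability.
    return 1.0 / (1.0 + np.exp(-x))

def dot_mapping(w, c):
    # The standard dot-product mapping used by both NCE and NEG.
    return w @ c

def neg_objective(w, c_pos, c_negs, mapping=dot_mapping):
    """Negative-sampling (NEG) loss for one (word, context) pair.

    `mapping` plays the role of the mapping-function parameter:
    any score function s(w, c) can be plugged in, not only the
    dot product.
    """
    loss = -np.log(sigmoid(mapping(w, c_pos)))          # positive pair
    for c_neg in c_negs:
        loss -= np.log(sigmoid(-mapping(w, c_neg)))     # sampled negatives
    return loss

rng = np.random.default_rng(0)
w = rng.normal(size=8)
c_pos = rng.normal(size=8)
c_negs = [rng.normal(size=8) for _ in range(5)]

# Same objective evaluated under two different mappings.
print(neg_objective(w, c_pos, c_negs))  # dot-product mapping
print(neg_objective(w, c_pos, c_negs,
                    mapping=lambda w, c: -np.linalg.norm(w - c)))  # illustrative alternative
```

Swapping `mapping` changes which word-context relation the embeddings encode; the paper's contribution is characterizing which mappings remain valid for both softmax estimation and embedding quality.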
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.6464