Learning Neural Trans-Dimensional Random Field Language Models with Noise-Contrastive Estimation

Bibliographic Details
Main Authors:	Wang, Bin; Ou, Zhijian
Format:	Conference Proceeding
Language:	English
Subjects:	Computational modeling; Estimation; Feature extraction; Language Model; Morphology; Neural networks; Noise-contrastive Estimation; Random Field; Scalability; Speech Recognition; Training

Description
Summary:	Trans-dimensional random field language models (TRF LMs), in which sentences are modeled as a collection of random fields, have shown performance close to that of LSTM LMs in speech recognition and are computationally more efficient in inference. However, the training efficiency of neural TRF LMs is not satisfactory, which limits the scalability of TRF LMs to large training corpora. In this paper, several techniques in both model formulation and parameter estimation are proposed to improve the training efficiency and the performance of neural TRF LMs. First, TRFs are reformulated as an exponential tilting of a reference distribution. Second, noise-contrastive estimation (NCE) is introduced to jointly estimate the model parameters and normalization constants. Third, the neural TRF LMs are extended by combining a deep convolutional neural network (CNN) and a bidirectional LSTM in the potential function, to extract deep hierarchical features and bidirectional sequential features. Together, these techniques enable successful and efficient training of neural TRF LMs on a 40x larger training set in only 1/3 of the training time, and further reduce the word error rate (WER) by a relative 4.7% on top of a strong LSTM LM baseline.
ISSN:	2379-190X
DOI:	10.1109/ICASSP.2018.8461813
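
Illustrative sketch (not from the paper): the summary describes a model written as an exponential tilting of a reference distribution, with NCE estimating the parameters and the normalization constant jointly. The toy example below shows that idea on a small categorical variable rather than on sentences. Everything in it is an assumption made for illustration: the tabular potential `phi` (the paper uses a neural potential), the uniform reference `q`, the reuse of `q` as the noise distribution, and the hypothetical names `nce_fit`, `nu`, and `zeta`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def nce_fit(data, q, nu=4, steps=2000, lr=0.5, seed=0):
    """Fit p(x) = q(x) * exp(phi[x] - zeta) by noise-contrastive estimation.

    data : 1-D integer array of observed symbols in {0, ..., K-1}
    q    : reference distribution over the K symbols, also used as noise
    nu   : number of noise samples drawn per data sample
    zeta : learned log normalization constant (treated as a free parameter)
    """
    rng = np.random.default_rng(seed)
    q = np.asarray(q, dtype=float)
    K, n = len(q), len(data)
    phi = np.zeros(K)  # tabular potential, one value per symbol
    zeta = 0.0
    for _ in range(steps):
        noise = rng.choice(K, size=nu * n, p=q)
        # Because the noise equals the reference q, the log-odds that x came
        # from the data simplify to phi[x] - zeta - log(nu).
        logit_d = phi[data] - zeta - np.log(nu)
        logit_n = phi[noise] - zeta - np.log(nu)
        # Derivatives of log(sigmoid) and log(1 - sigmoid) w.r.t. the logit.
        g_d = 1.0 - sigmoid(logit_d)
        g_n = -sigmoid(logit_n)
        grad_phi = (
            np.bincount(data, weights=g_d, minlength=K)
            + np.bincount(noise, weights=g_n, minlength=K)
        ) / n
        grad_zeta = -(g_d.sum() + g_n.sum()) / n
        # Gradient ascent on the NCE objective, updating phi and zeta together.
        phi += lr * grad_phi
        zeta += lr * grad_zeta
    return phi, zeta


def model_probs(phi, zeta, q):
    """Evaluate the tilted model; no explicit renormalization is applied."""
    return np.asarray(q, dtype=float) * np.exp(phi - zeta)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    p_true = np.array([0.5, 0.2, 0.15, 0.1, 0.05])
    q = np.full(5, 0.2)
    data = rng.choice(5, size=5000, p=p_true)
    phi, zeta = nce_fit(data, q)
    print(model_probs(phi, zeta, q))
```

In this sketch the fitted probabilities are never divided by their sum. If training has converged, they should sum to approximately one on their own, which is the sense in which NCE estimates the normalization constant alongside the other parameters.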