Large-scale continuous speech recognition system design using discriminative training

Bibliographic Details
Published in: The Journal of the Acoustical Society of America 2006-11, Vol. 120 (5_Supplement), p. 3042-3042
Main Authors: McDermott, Erik; Nakamura, Atsushi
Format: Article
Language: English
Description
Summary: Discriminative training is difficult to implement but essential to attaining state-of-the-art performance in automatic speech recognition systems today. Most of the discriminative training results for large-scale recognition tasks (with vocabularies well over 10 000 words) so far use the maximum mutual information (MMI) framework, but recent results for the minimum classification error (MCE) framework suggest that MCE too yields significant improvements in recognition accuracy and system compactness on large-scale tasks. MCE embodies rather well the general intuition that recognition system design should attempt to improve performance (i.e., recognition accuracy) directly, by optimizing a criterion function that is closely related to performance, rather than indirectly, by optimizing a criterion such as overall log likelihood that does not reflect performance. This presentation provides an overview of the MCE framework, and describes recent MCE speech recognition results obtained with both the MIT Galaxy system and the NTT Communication Science Labs SOLON system. The tasks examined include a 33 000-word vocabulary telephone-based spontaneous speech weather information task, a 22 000-word telephone-based name recognition task, and a 100 000-word Japanese lecture speech transcription task.
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.4787215
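
The summary's point that MCE optimizes a criterion closely tied to recognition accuracy can be illustrated with a minimal sketch. It assumes the commonly cited smoothed-error form of MCE: a misclassification measure comparing the correct class score with a soft maximum over its rivals, passed through a sigmoid. This is not necessarily the exact formulation used in the presentation, and the names `mce_loss`, `eta`, and `alpha` are hypothetical, chosen only for illustration.

```python
import math


def mce_loss(scores, correct, eta=1.0, alpha=1.0):
    """Smoothed classification-error loss for a single token (illustrative).

    scores  -- discriminant values g_j, one per class (e.g. log likelihoods)
    correct -- index of the correct class
    eta     -- assumed smoothing constant; larger values approach a hard max
    alpha   -- assumed sigmoid slope; larger values approach a 0/1 error count
    """
    g_correct = scores[correct]
    rivals = [s for j, s in enumerate(scores) if j != correct]

    # Soft maximum over competing classes:
    #   (1 / eta) * log( mean_j exp(eta * g_j) )
    # computed with the largest rival factored out for numerical stability.
    m = max(rivals)
    mean_exp = sum(math.exp(eta * (s - m)) for s in rivals) / len(rivals)
    soft_max = m + math.log(mean_exp) / eta

    # Misclassification measure: positive when the rivals outscore the
    # correct class, negative when the token is recognized correctly.
    d = soft_max - g_correct

    # The sigmoid turns d into a differentiable approximation of a 0/1 error.
    return 1.0 / (1.0 + math.exp(-alpha * d))


if __name__ == "__main__":
    print(mce_loss([5.0, 0.0, 0.0], correct=0))  # correct class wins: below 0.5
    print(mce_loss([0.0, 5.0, 0.0], correct=0))  # a rival wins: above 0.5
```

Averaged over a training set, this loss approximates the error rate itself, which is the sense in which the criterion reflects performance more directly than overall log likelihood does.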