Expanding the vocabulary of a connectionist recognizer trained on the DARPA Resource Management corpus

Bibliographic Details
Main Authors: Lucke, H., Fallside, F.
Format: Conference Proceeding
Language: English
Description
Summary: It is shown how the compositional representation (CR), previously used for lexical access from sub-word recognizers for a relatively small word vocabulary, can be extended to much larger vocabularies without further training. This is demonstrated for the DARPA Resource Management database where, using sub-word units as input, words are represented distributively over a fixed number of units and classified using a simple network. Initially, the architecture is trained on 147 words, achieving an accuracy of 91.2%. Then, leaving the recognizer unchanged, it is shown how additional output units can be added to the network to increase the vocabulary to the complete set of 975 phonetically distinct words. On this extended vocabulary the performance dropped to 66%, but this drop is less than would be expected from the perplexity increase. Further improvement would be achieved by improving the performance on the original data set.
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.1992.225836