Loading…

Articulatory Information for Noise Robust Speech Recognition

Prior research has shown that articulatory information, if extracted properly from the speech signal, can improve the performance of automatic speech recognition systems. However, such information is not readily available in the signal. The challenge posed by the estimation of articulatory informati...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on audio, speech, and language processing speech, and language processing, 2011-09, Vol.19 (7), p.1913-1924
Main Authors: Mitra, V., Hosung Nam, Espy-Wilson, C. Y., Saltzman, E., Goldstein, L.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Prior research has shown that articulatory information, if extracted properly from the speech signal, can improve the performance of automatic speech recognition systems. However, such information is not readily available in the signal. The challenge posed by the estimation of articulatory information from speech acoustics has led to a new line of research known as "acoustic-to-articulatory inversion" or "speech-inversion." While most of the research in this area has focused on estimating articulatory information more accurately, few have explored ways to apply this information in speech recognition tasks. In this paper, we first estimated articulatory information in the form of vocal tract constriction variables (abbreviated as TVs) from the Aurora-2 speech corpus using a neural network based speech-inversion model. Word recognition tasks were then performed for both noisy and clean speech using articulatory information in conjunction with traditional acoustic features. Our results indicate that incorporating TVs can significantly improve word recognition rates when used in conjunction with traditional acoustic features.
ISSN:1558-7916
2329-9290
1558-7924
2329-9304
DOI:10.1109/TASL.2010.2103058