Loading…
Delta-spectral cepstral coefficients for robust speech recognition
Almost all current automatic speech recognition (ASR) systems conventionally append delta and double-delta cepstral features to static cepstral features. In this work we describe a modified feature-extraction procedure in which the time-difference operation is performed in the spectral domain, rathe...
Saved in:
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Almost all current automatic speech recognition (ASR) systems conventionally append delta and double-delta cepstral features to static cepstral features. In this work we describe a modified feature-extraction procedure in which the time-difference operation is performed in the spectral domain, rather than the cepstral domain as is generally presently done. We argue that this approach based on "delta-spectral" features is needed because even though delta-cepstral features capture dynamic speech information and generally greatly improve ASR recognition accuracy, they are not robust to noise and reverberation. We support the validity of the delta-spectral approach both with observations about the modulation spectrum of speech and noise, and with objective experiments that document the benefit that the delta-spectral approach brings to a variety of currently popular feature extraction algorithms. We found that the use of delta-spectral features, rather than the more traditional delta-cepstral features, improves the effective SNR by between 5 and 8 dB for background music and white noise, and recognition accuracy in reverberant environments is improved as well. |
---|---|
ISSN: | 1520-6149 2379-190X |
DOI: | 10.1109/ICASSP.2011.5947425 |