Loading…

Band importance for speech-in-speech recognition in the presence of extended high-frequency cues

Band importance functions for speech-in-noise recognition, typically determined in the presence of steady background noise, indicate a negligible role for extended high frequencies (EHFs; 8–20 kHz). However, recent findings indicate that EHF cues support speech recognition in multi-talker environmen...

Full description

Saved in:
Bibliographic Details
Published in:The Journal of the Acoustical Society of America 2024-08, Vol.156 (2), p.1202-1213
Main Authors: Ananthanarayana, Rohit M., Buss, Emily, Monson, Brian B.
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Band importance functions for speech-in-noise recognition, typically determined in the presence of steady background noise, indicate a negligible role for extended high frequencies (EHFs; 8–20 kHz). However, recent findings indicate that EHF cues support speech recognition in multi-talker environments, particularly when the masker has reduced EHF levels relative to the target. This scenario can occur in natural auditory scenes when the target talker is facing the listener, but the maskers are not. In this study, we measured the importance of five bands from 40 to 20 000 Hz for speech-in-speech recognition by notch-filtering the bands individually. Stimuli consisted of a female target talker recorded from 0° and a spatially co-located two-talker female masker recorded either from 0° or 56.25°, simulating a masker either facing the listener or facing away, respectively. Results indicated peak band importance in the 0.4–1.3 kHz band and a negligible effect of removing the EHF band in the facing-masker condition. However, in the non-facing condition, the peak was broader and EHF importance was higher and comparable to that of the 3.3–8.3 kHz band in the facing-masker condition. These findings suggest that EHFs contain important cues for speech recognition in listening conditions with mismatched talker head orientations.
ISSN:0001-4966
1520-8524
1520-8524
DOI:10.1121/10.0028269