Automatic detection of the features [high] and [low] in a landmark-based model of speech perception
Published in: | The Journal of the Acoustical Society of America, 2004-05, Vol. 115 (5, Supplement), p. 2428 |
---|---|
Main Author: | |
Format: | Article |
Language: | English |
Summary: | This research is part of a landmark-based approach to modeling speech perception in which sound segments are assumed to be represented as bundles of binary distinctive features. In this model, probability estimates for feature values are derived from measurements of the acoustics in the vicinity of landmarks. The goal of the current project is to automatically detect the features [high] and [low] for vowel segments based on measurements from average spectra. A long-term and a short-term average spectrum are computed using all vowel regions in the utterance and are used to estimate speaker-specific parameters such as average F0 and average F3 (an indicator of vocal tract length). These parameters are used to estimate F1 using a peak-picking process on the average spectrum at each vowel-landmark. Preliminary results are derived from read connected speech for 738 vowels from 80 utterances (two male speakers, two female speakers). Speaker-independent logistic regression analysis using only average F0 and F1 determines the feature [high] with 73% accuracy and the feature [low] with 84% accuracy. Proposals are made for methods to use additional spectral detail to create a more robust estimate for vowels which show significant formant movement. [Work supported by NIH Grant No. DC02978.] |
---|---|
ISSN: | 0001-4966, 1520-8524 |
DOI: | 10.1121/1.4781446 |
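
The classification step described in the summary (logistic regression on average F0 and F1 to decide a binary feature such as [high]) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic values invented for the example (not the 738 vowels of the study), the fit uses plain batch gradient descent in NumPy, and the names `fit_logistic` and `predict_proba` are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit a binary logistic regression by batch gradient descent.

    X: (n, d) array of predictors, here columns [average F0, F1] in Hz.
    y: (n,) array of 0/1 labels, here 1 = [+high], 0 = [-high].
    Returns the weights (intercept first) and the scaling statistics.
    """
    # Standardize the predictors so one learning rate suits both columns.
    mean, std = X.mean(axis=0), X.std(axis=0)
    Xb = np.hstack([np.ones((len(X), 1)), (X - mean) / std])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))      # predicted P(y = 1)
        w -= lr * Xb.T @ (p - y) / len(y)      # gradient of the mean log-loss
    return w, mean, std


def predict_proba(X, w, mean, std):
    """Return P(feature value = +) for each row of X."""
    Xb = np.hstack([np.ones((len(X), 1)), (X - mean) / std])
    return 1.0 / (1.0 + np.exp(-Xb @ w))


# ASSUMPTION: synthetic data for illustration only. High vowels are given a
# lower F1 than non-high vowels, and F1 is shifted slightly with speaker F0.
rng = np.random.default_rng(0)
n = 200
f0 = rng.normal(160.0, 40.0, size=n)                      # average F0, Hz
is_high = rng.integers(0, 2, size=n)                      # 1 = [+high]
f1 = (np.where(is_high == 1, 350.0, 650.0)
      + 0.5 * (f0 - 160.0)
      + rng.normal(0.0, 60.0, size=n))                    # F1, Hz

X = np.column_stack([f0, f1])
w, mean, std = fit_logistic(X, is_high.astype(float))
p_high = predict_proba(X, w, mean, std)
accuracy = np.mean((p_high >= 0.5) == (is_high == 1))
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

Because the model outputs a probability rather than a hard label, it fits the framework in the summary, where each feature value receives a probability estimate. The accuracy printed here reflects only the invented data and says nothing about the figures reported in the abstract.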