Application of multidimensional scaling to subjective evaluation of coded speech
Published in: | The Journal of the Acoustical Society of America 2001-10, Vol.110 (4), p.2167-2182 |
---|---|
Main Author: | |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | We present results from a pilot study directed at developing an anchorable subjective speech quality test. The test uses multidimensional scaling techniques to obtain quantitative information about the perceptual attributes of speech. In the first phase of the study, subjects ranked perceptual distances between samples of speech produced by two different talkers, one male and one female, processed by a variety of codecs. The resulting distance matrices were processed to obtain, for each talker, a stimulus space for the various speech samples. This stimulus space has two properties: distances between stimuli in the space correspond to perceptual distances between stimuli, and the dimensions of the space correspond to attributes used by the subjects in determining perceptual distances. Mean opinion scores (MOS) obtained in an earlier study were found to be highly correlated with position in the stimulus space, and the three dimensions of the stimulus space were found to have identifiable physical and perceptual correlates. In the second phase of the study, we developed techniques for fitting speech generated by a new codec under investigation into a previously established stimulus space. The user is provided with a collection of speech samples and with the stimulus space for these speech samples as determined by a large-scale listening test. The user then carries out a much smaller listening test to determine the position of the new stimulus in the previously established stimulus space. This system is anchorable, so that different versions of a codec under development can be compared directly, and it provides more detailed information than the single number provided by MOS testing. We suggest that this information could be used to advantage in algorithm development and in development of objective measures of speech quality. |
ISSN: | 0001-4966 1520-8524 |
DOI: | 10.1121/1.1397322 |
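
The two phases described in the summary can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the article: it uses classical (Torgerson) metric multidimensional scaling and Gower's add-a-point formula, whereas the study worked from ranked distances and may have used a different scaling procedure. The function names `classical_mds` and `place_new_stimulus` are hypothetical.

```python
import numpy as np


def classical_mds(dist, n_dims=3):
    """Embed stimuli from a symmetric distance matrix (classical MDS).

    Assumption: metric (Torgerson) scaling, used here only for illustration.
    Returns an (n, n_dims) array of centered coordinates.
    """
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the n_dims largest eigenvalues; clip small negatives to zero.
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def place_new_stimulus(anchors, dist_to_anchors):
    """Place one new stimulus in an existing, centered stimulus space.

    Uses Gower's add-a-point formula: with centered anchors X and
    b_i = ||x_i||^2, the new point is 0.5 * pinv(X) @ (b - d^2).
    """
    anchors = np.asarray(anchors, dtype=float)
    d2 = np.asarray(dist_to_anchors, dtype=float) ** 2
    b = np.sum(anchors ** 2, axis=1)
    return 0.5 * np.linalg.pinv(anchors) @ (b - d2)
```

Here `classical_mds` stands in for the large-scale listening test that establishes the stimulus space, and `place_new_stimulus` stands in for the smaller test in which only distances from the new codec's output to the existing samples are collected. With noisy perceptual judgments the placement is a least-squares estimate, not an exact fit.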