Automatic segmentation and labeling of speech
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | The authors investigate an automatic approach to two tasks: segmenting speech that is already labeled, and both labeling and segmenting speech when only its orthographic transcription is available. The technique uses a phone recognition system built on a trigram phonotactic model, gamma-distribution phone duration models, and a spectral model based on five different structures for phone models of varying contextual dependencies. The alignment of speech with a given phone sequence is performed as a very constrained phone recognition task in which the phonotactic model is based only on the given phone sequence. When only the orthographic transcription is provided, a classification-tree-based prediction of the most likely phone realizations is used as the input network for the phone recognizer. The maximum-likelihood phone sequence is then treated as the true phone sequence, and its segment boundaries are compared with the reference boundaries. |
---|---|
ISSN: | 1520-6149 2379-190X |
DOI: | 10.1109/ICASSP.1991.150379 |
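
The summary describes alignment as a phone recognition task constrained to the given phone sequence. As a rough illustration only, the sketch below shows that general idea as a dynamic-programming forced alignment. It is not the authors' system: the trigram phonotactic model, gamma duration models, and context-dependent spectral models are not reproduced, and the function name `force_align` and the input `frame_loglik` (per-frame log-likelihoods for each phone in the known sequence) are assumptions made for this example.

```python
import math


def force_align(frame_loglik):
    """Align T frames to a known sequence of K phones (hypothetical helper).

    frame_loglik[t][k] is the log-likelihood of frame t under the k-th phone
    of the given sequence. Every phone must cover at least one frame, and
    phones must appear in order. Returns one (start, end) frame range per
    phone, with end exclusive.
    """
    num_frames = len(frame_loglik)
    num_phones = len(frame_loglik[0])
    if num_frames < num_phones:
        raise ValueError("need at least one frame per phone")

    neg_inf = -math.inf
    # best[t][k]: best total score for frames 0..t with frame t in phone k.
    best = [[neg_inf] * num_phones for _ in range(num_frames)]
    # entered[t][k]: True if phone k starts at frame t on the best path.
    entered = [[False] * num_phones for _ in range(num_frames)]
    best[0][0] = frame_loglik[0][0]

    for t in range(1, num_frames):
        for k in range(num_phones):
            stay = best[t - 1][k]
            advance = best[t - 1][k - 1] if k > 0 else neg_inf
            if advance > stay:
                best[t][k] = advance + frame_loglik[t][k]
                entered[t][k] = True
            else:
                best[t][k] = stay + frame_loglik[t][k]

    # Backtrace from the last frame of the last phone to recover boundaries.
    segments = [None] * num_phones
    k = num_phones - 1
    end = num_frames
    for t in range(num_frames - 1, 0, -1):
        if entered[t][k]:
            segments[k] = (t, end)
            end = t
            k -= 1
    segments[0] = (0, end)
    return segments


# Toy input: the first two frames favor phone 0, the last three favor phone 1.
toy_loglik = [[0, -5], [0, -5], [-5, 0], [-5, 0], [-5, 0]]
toy_segments = force_align(toy_loglik)
```

In a fuller system, the stay and advance scores would also include duration and transition terms; here they are left out to keep the boundary search itself visible.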