
Comparing glottal-flow-excited statistical parametric speech synthesis methods

Bibliographic Details
Main Authors: Raitio, Tuomo; Suni, Antti; Vainio, Martti; Alku, Paavo
Format: Conference Proceeding
Language: English
Description
Summary: This paper studies the performance of glottal-flow-signal-based excitation methods in statistical parametric speech synthesis. The current state of the art in excitation modeling is reviewed, and three excitation methods are selected for experiments. Two of the methods are based on principal component analysis (PCA) decomposition of estimated glottal flow pulses: the first uses only the mean of the pulses, while the second uses 12 principal components in addition to the mean signal to model the glottal flow waveform. The third method uses a glottal flow pulse library from which pulses are selected according to target and concatenation costs. Subjective listening tests are carried out to determine the quality and similarity of the synthetic speech of one male and one female speaker. The results show that the PCA-based methods are rated best in both quality and similarity, but adding more components does not yield any improvement.
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2013.6639188
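
Illustration: the two PCA-based methods in the summary differ only in how many principal components are added to the mean pulse. The sketch below is not the authors' implementation. It assumes the pulses have already been estimated and length-normalized, and it uses synthetic toy pulses instead of real glottal flow estimates. The function names fit_pulse_pca and reconstruct are hypothetical.

```python
import numpy as np


def fit_pulse_pca(pulses, n_components):
    """Fit PCA to pulses of shape (n_pulses, pulse_len).

    Returns the mean pulse and the first n_components principal
    components (one per row). Assumes equal-length pulses.
    """
    mean = pulses.mean(axis=0)
    centered = pulses - mean
    # Rows of vt are the principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def reconstruct(pulse, mean, components):
    """Approximate one pulse as mean + weighted sum of components."""
    weights = components @ (pulse - mean)
    return mean + weights @ components


# Toy data (an assumption for illustration, not real glottal flow):
# smooth bumps with randomly varying shape plus a little noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
pulses = np.stack([
    np.sin(np.pi * t) ** rng.uniform(1.5, 3.0)
    + 0.01 * rng.standard_normal(t.size)
    for _ in range(100)
])

# Method 1 in the summary: mean pulse only (zero components).
mean, no_comps = fit_pulse_pca(pulses, 0)
# Method 2 in the summary: mean plus 12 principal components.
_, comps12 = fit_pulse_pca(pulses, 12)

recon_mean = np.stack([reconstruct(p, mean, no_comps) for p in pulses])
recon_pca = np.stack([reconstruct(p, mean, comps12) for p in pulses])

err_mean = float(np.mean((pulses - recon_mean) ** 2))
err_pca = float(np.mean((pulses - recon_pca) ** 2))
print(f"MSE, mean only: {err_mean:.6f}")
print(f"MSE, mean + 12 components: {err_pca:.6f}")
```

On toy data like this, the 12-component reconstruction has a lower waveform error than the mean alone. The summary's listening-test result is a separate question: a closer waveform fit did not translate into higher perceived quality or similarity.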