Synthesizing speech acoustics from head and face motion
Published in: The Journal of the Acoustical Society of America, 2005-04, Vol. 117 (4_Supplement), p. 2542
Main Authors: , , ,
Format: Article
Language: English
Summary: This work outlines a quantitative analysis of the relation between speech acoustics and the head and face motions that occur simultaneously with them [A. V. Barbosa, Ph.D. thesis, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil, 2004]. 2-D motion data are obtained by means of a video camera. An algorithm has been developed for tracking markers on the speaker’s face across the acquired video sequence [A. V. Barbosa, E. Vatikiotis-Bateson, and A. Daffertshofer, in Proceedings of the 8th ICSLP Interspeech 2004, Korea, 2004]. The motion domain is represented by the 2-D marker trajectories, whereas line spectrum pair (LSP) coefficients and the fundamental frequency F0 represent the speech acoustics domain. Mathematical models are trained to estimate the acoustic parameters (LSPs + F0) from the motion parameters (2-D marker positions). The estimated acoustic parameters are then used to synthesize the acoustic speech signal. Cross-domain analysis is performed for undecomposed (i.e., full head + face) and decomposed (i.e., separated head and face) normalized 2-D motions. Syntheses from each method are being evaluated through intelligibility tests and qualitative comparison of the original and synthesized utterances.
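The abstract does not specify the form of the trained cross-domain models. As a minimal illustrative sketch, one could fit a least-squares affine map from per-frame 2-D marker positions to the acoustic parameter vector (LSP coefficients + F0); the marker count, LSP order, and synthetic data below are all assumptions for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_frames = 500
n_markers = 12   # assumed number of face markers (not from the paper)
n_lsp = 16       # assumed LSP order (not from the paper)

# Motion features: x/y position of each marker per video frame
# (synthetic stand-in data for this sketch).
X = rng.standard_normal((n_frames, 2 * n_markers))

# Acoustic targets: LSP coefficients plus F0 per frame, generated here
# from a random linear map plus noise so the fit has a known structure.
true_W = rng.standard_normal((2 * n_markers, n_lsp + 1))
Y = X @ true_W + 0.01 * rng.standard_normal((n_frames, n_lsp + 1))

# Train: least-squares solution W minimizing ||X W - Y||^2.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Estimate acoustic parameters from motion, then measure the fit.
Y_hat = X @ W
rmse = float(np.sqrt(np.mean((Y_hat - Y) ** 2)))
print(f"per-frame RMSE: {rmse:.4f}")
```

In a real pipeline the estimated LSPs would be converted back to filter coefficients and combined with the estimated F0 to drive a speech synthesizer; here the RMSE simply confirms the mapping was recovered.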
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.4788446