
Factor analyzed voice models for HMM-based speech synthesis

This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately r...

Bibliographic Details
Main Authors: Kazumi, K, Nankaku, Y, Tokuda, K
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 4237
container_issue
container_start_page 4234
container_title
container_volume
creator Kazumi, K
Nankaku, Y
Tokuda, K
description This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework.
doi_str_mv 10.1109/ICASSP.2010.5495689
format conference_proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5495689</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5495689</ieee_id><sourcerecordid>5495689</sourcerecordid><originalsourceid>FETCH-LOGICAL-i241t-cb9a2f6be3233b717fb030519521dad004448e66da2a9b61e741aaaecd80cfec3</originalsourceid><addsrcrecordid>eNpVkM1Kw0AUhcc_MNY-QTd5gdR7ZyYzGVxJsVZoUaiCu3Izc0MjaVMyQahP34DduDpwPjh8HCEmCFNEcA-vs6f1-n0qYShy7XJTuAsxdrZALbXW0hlzKRKprMvQwdfVP5a7a5FgLiEzqN2tuIvxGwAKq4tEPM7J922X0p6a4y-H9KetPae7NnAT02ogi9UqKykOKB6Y_TaNx32_5VjHe3FTURN5fM6R-Jw_f8wW2fLtZTBeZrXU2Ge-dCQrU7KSSpUWbVWCghxdLjFQABgsCzYmkCRXGmSrkYjYhwJ8xV6NxORvt2bmzaGrd9QdN-cf1AkNJU1u</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Factor analyzed voice models for HMM-based speech synthesis</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Kazumi, K ; Nankaku, Y ; Tokuda, K</creator><creatorcontrib>Kazumi, K ; Nankaku, Y ; Tokuda, K</creatorcontrib><description>This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. 
In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9781424442959</identifier><identifier>ISBN: 1424442958</identifier><identifier>EISSN: 2379-190X</identifier><identifier>EISBN: 9781424442966</identifier><identifier>EISBN: 1424442966</identifier><identifier>DOI: 10.1109/ICASSP.2010.5495689</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Annealing ; Character generation ; Decision trees ; deterministic annealing EM algorithm ; eigenvoice ; expectation maximization algorithm ; factor analysis ; Hidden Markov models ; HMM-based speech synthesis ; Maximum likelihood estimation ; Principal component analysis ; Speech analysis ; Speech synthesis ; Training data</subject><ispartof>2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4234-4237</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5495689$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5495689$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kazumi, K</creatorcontrib><creatorcontrib>Nankaku, Y</creatorcontrib><creatorcontrib>Tokuda, K</creatorcontrib><title>Factor analyzed voice models for HMM-based speech synthesis</title><title>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</title><addtitle>ICASSP</addtitle><description>This paper describes factor analyzed voice models for realizing various voice characteristics in 
the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework.</description><subject>Algorithm design and analysis</subject><subject>Annealing</subject><subject>Character generation</subject><subject>Decision trees</subject><subject>deterministic annealing EM algorithm</subject><subject>eigenvoice</subject><subject>expectation maximization algorithm</subject><subject>factor analysis</subject><subject>Hidden Markov models</subject><subject>HMM-based speech synthesis</subject><subject>Maximum likelihood estimation</subject><subject>Principal component analysis</subject><subject>Speech analysis</subject><subject>Speech synthesis</subject><subject>Training data</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9781424442959</isbn><isbn>1424442958</isbn><isbn>9781424442966</isbn><isbn>1424442966</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpVkM1Kw0AUhcc_MNY-QTd5gdR7ZyYzGVxJsVZoUaiCu3Izc0MjaVMyQahP34DduDpwPjh8HCEmCFNEcA-vs6f1-n0qYShy7XJTuAsxdrZALbXW0hlzKRKprMvQwdfVP5a7a5FgLiEzqN2tuIvxGwAKq4tEPM7J922X0p6a4y-H9KetPae7NnAT02ogi9UqKykOKB6Y_TaNx32_5VjHe3FTURN5fM6R-Jw_f8wW2fLtZTBeZrXU2Ge-dCQrU7KSSpUWbVWCghxdLjFQABgsCzYmkCRXGmSrkYjYhwJ8xV6NxORvt2bmzaGrd9QdN-cf1AkNJU1u</recordid><startdate>20100101</startdate><enddate>20100101</enddate><creator>Kazumi, 
K</creator><creator>Nankaku, Y</creator><creator>Tokuda, K</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20100101</creationdate><title>Factor analyzed voice models for HMM-based speech synthesis</title><author>Kazumi, K ; Nankaku, Y ; Tokuda, K</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i241t-cb9a2f6be3233b717fb030519521dad004448e66da2a9b61e741aaaecd80cfec3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Algorithm design and analysis</topic><topic>Annealing</topic><topic>Character generation</topic><topic>Decision trees</topic><topic>deterministic annealing EM algorithm</topic><topic>eigenvoice</topic><topic>expectation maximization algorithm</topic><topic>factor analysis</topic><topic>Hidden Markov models</topic><topic>HMM-based speech synthesis</topic><topic>Maximum likelihood estimation</topic><topic>Principal component analysis</topic><topic>Speech analysis</topic><topic>Speech synthesis</topic><topic>Training data</topic><toplevel>online_resources</toplevel><creatorcontrib>Kazumi, K</creatorcontrib><creatorcontrib>Nankaku, Y</creatorcontrib><creatorcontrib>Tokuda, K</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kazumi, K</au><au>Nankaku, Y</au><au>Tokuda, K</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Factor analyzed voice models for HMM-based speech 
synthesis</atitle><btitle>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</btitle><stitle>ICASSP</stitle><date>2010-01-01</date><risdate>2010</risdate><spage>4234</spage><epage>4237</epage><pages>4234-4237</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9781424442959</isbn><isbn>1424442958</isbn><eisbn>9781424442966</eisbn><eisbn>1424442966</eisbn><abstract>This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2010.5495689</doi><tpages>4</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4234-4237
issn 1520-6149
2379-190X
language eng
recordid cdi_ieee_primary_5495689
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Algorithm design and analysis
Annealing
Character generation
Decision trees
deterministic annealing EM algorithm
eigenvoice
expectation maximization algorithm
factor analysis
Hidden Markov models
HMM-based speech synthesis
Maximum likelihood estimation
Principal component analysis
Speech analysis
Speech synthesis
Training data
title Factor analyzed voice models for HMM-based speech synthesis
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T21%3A10%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Factor%20analyzed%20voice%20models%20for%20HMM-based%20speech%20synthesis&rft.btitle=2010%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing&rft.au=Kazumi,%20K&rft.date=2010-01-01&rft.spage=4234&rft.epage=4237&rft.pages=4234-4237&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9781424442959&rft.isbn_list=1424442958&rft_id=info:doi/10.1109/ICASSP.2010.5495689&rft.eisbn=9781424442966&rft.eisbn_list=1424442966&rft_dat=%3Cieee_6IE%3E5495689%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i241t-cb9a2f6be3233b717fb030519521dad004448e66da2a9b61e741aaaecd80cfec3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5495689&rfr_iscdi=true