Factor analyzed voice models for HMM-based speech synthesis
This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately r...
Main Authors: | Kazumi, K ; Nankaku, Y ; Tokuda, K |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 4237 |
container_issue | |
container_start_page | 4234 |
container_title | |
container_volume | |
creator | Kazumi, K ; Nankaku, Y ; Tokuda, K |
description | This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework. |
doi_str_mv | 10.1109/ICASSP.2010.5495689 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5495689</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5495689</ieee_id><sourcerecordid>5495689</sourcerecordid><originalsourceid>FETCH-LOGICAL-i241t-cb9a2f6be3233b717fb030519521dad004448e66da2a9b61e741aaaecd80cfec3</originalsourceid><addsrcrecordid>eNpVkM1Kw0AUhcc_MNY-QTd5gdR7ZyYzGVxJsVZoUaiCu3Izc0MjaVMyQahP34DduDpwPjh8HCEmCFNEcA-vs6f1-n0qYShy7XJTuAsxdrZALbXW0hlzKRKprMvQwdfVP5a7a5FgLiEzqN2tuIvxGwAKq4tEPM7J922X0p6a4y-H9KetPae7NnAT02ogi9UqKykOKB6Y_TaNx32_5VjHe3FTURN5fM6R-Jw_f8wW2fLtZTBeZrXU2Ge-dCQrU7KSSpUWbVWCghxdLjFQABgsCzYmkCRXGmSrkYjYhwJ8xV6NxORvt2bmzaGrd9QdN-cf1AkNJU1u</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Factor analyzed voice models for HMM-based speech synthesis</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Kazumi, K ; Nankaku, Y ; Tokuda, K</creator><creatorcontrib>Kazumi, K ; Nankaku, Y ; Tokuda, K</creatorcontrib><description>This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. 
In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9781424442959</identifier><identifier>ISBN: 1424442958</identifier><identifier>EISSN: 2379-190X</identifier><identifier>EISBN: 9781424442966</identifier><identifier>EISBN: 1424442966</identifier><identifier>DOI: 10.1109/ICASSP.2010.5495689</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Annealing ; Character generation ; Decision trees ; deterministic annealing EM algorithm ; eigenvoice ; expectation maximization algorithm ; factor analysis ; Hidden Markov models ; HMM-based speech synthesis ; Maximum likelihood estimation ; Principal component analysis ; Speech analysis ; Speech synthesis ; Training data</subject><ispartof>2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4234-4237</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5495689$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5495689$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kazumi, K</creatorcontrib><creatorcontrib>Nankaku, Y</creatorcontrib><creatorcontrib>Tokuda, K</creatorcontrib><title>Factor analyzed voice models for HMM-based speech synthesis</title><title>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</title><addtitle>ICASSP</addtitle><description>This paper describes factor analyzed voice models for realizing various voice characteristics in 
the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework.</description><subject>Algorithm design and analysis</subject><subject>Annealing</subject><subject>Character generation</subject><subject>Decision trees</subject><subject>deterministic annealing EM algorithm</subject><subject>eigenvoice</subject><subject>expectation maximization algorithm</subject><subject>factor analysis</subject><subject>Hidden Markov models</subject><subject>HMM-based speech synthesis</subject><subject>Maximum likelihood estimation</subject><subject>Principal component analysis</subject><subject>Speech analysis</subject><subject>Speech synthesis</subject><subject>Training data</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9781424442959</isbn><isbn>1424442958</isbn><isbn>9781424442966</isbn><isbn>1424442966</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpVkM1Kw0AUhcc_MNY-QTd5gdR7ZyYzGVxJsVZoUaiCu3Izc0MjaVMyQahP34DduDpwPjh8HCEmCFNEcA-vs6f1-n0qYShy7XJTuAsxdrZALbXW0hlzKRKprMvQwdfVP5a7a5FgLiEzqN2tuIvxGwAKq4tEPM7J922X0p6a4y-H9KetPae7NnAT02ogi9UqKykOKB6Y_TaNx32_5VjHe3FTURN5fM6R-Jw_f8wW2fLtZTBeZrXU2Ge-dCQrU7KSSpUWbVWCghxdLjFQABgsCzYmkCRXGmSrkYjYhwJ8xV6NxORvt2bmzaGrd9QdN-cf1AkNJU1u</recordid><startdate>20100101</startdate><enddate>20100101</enddate><creator>Kazumi, 
K</creator><creator>Nankaku, Y</creator><creator>Tokuda, K</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20100101</creationdate><title>Factor analyzed voice models for HMM-based speech synthesis</title><author>Kazumi, K ; Nankaku, Y ; Tokuda, K</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i241t-cb9a2f6be3233b717fb030519521dad004448e66da2a9b61e741aaaecd80cfec3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Algorithm design and analysis</topic><topic>Annealing</topic><topic>Character generation</topic><topic>Decision trees</topic><topic>deterministic annealing EM algorithm</topic><topic>eigenvoice</topic><topic>expectation maximization algorithm</topic><topic>factor analysis</topic><topic>Hidden Markov models</topic><topic>HMM-based speech synthesis</topic><topic>Maximum likelihood estimation</topic><topic>Principal component analysis</topic><topic>Speech analysis</topic><topic>Speech synthesis</topic><topic>Training data</topic><toplevel>online_resources</toplevel><creatorcontrib>Kazumi, K</creatorcontrib><creatorcontrib>Nankaku, Y</creatorcontrib><creatorcontrib>Tokuda, K</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kazumi, K</au><au>Nankaku, Y</au><au>Tokuda, K</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Factor analyzed voice models for HMM-based speech 
synthesis</atitle><btitle>2010 IEEE International Conference on Acoustics, Speech and Signal Processing</btitle><stitle>ICASSP</stitle><date>2010-01-01</date><risdate>2010</risdate><spage>4234</spage><epage>4237</epage><pages>4234-4237</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9781424442959</isbn><isbn>1424442958</isbn><eisbn>9781424442966</eisbn><eisbn>1424442966</eisbn><abstract>This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent training data accurately. To overcome this problem, we propose a general speech model which generates speech utterances with various voice characteristics directly. In the proposed method, the HMM states, factors representing voice characteristics and contextual decision trees are simultaneously optimized within a unified framework.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2010.5495689</doi><tpages>4</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-6149 |
ispartof | 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010, p.4234-4237 |
issn | 1520-6149 2379-190X |
language | eng |
recordid | cdi_ieee_primary_5495689 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Algorithm design and analysis ; Annealing ; Character generation ; Decision trees ; deterministic annealing EM algorithm ; eigenvoice ; expectation maximization algorithm ; factor analysis ; Hidden Markov models ; HMM-based speech synthesis ; Maximum likelihood estimation ; Principal component analysis ; Speech analysis ; Speech synthesis ; Training data |
title | Factor analyzed voice models for HMM-based speech synthesis |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T21%3A10%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Factor%20analyzed%20voice%20models%20for%20HMM-based%20speech%20synthesis&rft.btitle=2010%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing&rft.au=Kazumi,%20K&rft.date=2010-01-01&rft.spage=4234&rft.epage=4237&rft.pages=4234-4237&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9781424442959&rft.isbn_list=1424442958&rft_id=info:doi/10.1109/ICASSP.2010.5495689&rft.eisbn=9781424442966&rft.eisbn_list=1424442966&rft_dat=%3Cieee_6IE%3E5495689%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i241t-cb9a2f6be3233b717fb030519521dad004448e66da2a9b61e741aaaecd80cfec3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5495689&rfr_iscdi=true |