Speaker-adapted training on the Switchboard Corpus
Speaker adaptation is the process of transforming some speaker-independent acoustic model in such a way as to more closely match the characteristics of a particular speaker. It has been shown by several researchers to be an effective means of improving the performance of large vocabulary continuous...
Main Authors: | McDonough, J. ; Anastasakos, T. ; Zavaliagkos, G. ; Gish, H. |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 1062 vol.2 |
container_issue | |
container_start_page | 1059 |
container_title | |
container_volume | 2 |
creator | McDonough, J. ; Anastasakos, T. ; Zavaliagkos, G. ; Gish, H. |
description | Speaker adaptation is the process of transforming some speaker-independent acoustic model in such a way as to more closely match the characteristics of a particular speaker. It has been shown by several researchers to be an effective means of improving the performance of large vocabulary continuous speech recognition systems. Until very recently speaker adaptation has been used exclusively as a part of the recognition process. This is undesirable inasmuch as it leads to a mismatched condition between test and training, and hence sub-optimal recognition performance. There has been a growing interest in applying speaker-adaptation techniques to HMM training in order to alleviate the training/test mismatch. In prior work, we presented an iterative scheme for determining the maximum likelihood solution for the set of speaker-independent means and variances when speaker-dependent adaptation is performed during HMM training. In the present work, we investigate specific issues encountered in applying this general framework to the task of improving recognition performance on the Switchboard Corpus. |
doi_str_mv | 10.1109/ICASSP.1997.596123 |
format | conference_proceeding |
creationdate | 1997 |
isbn | 0818679190 9780818679193 |
publisher | Washington DC: IEEE |
rights | 1998 INIST-CNRS |
tpages | 4 |
linktohtml | https://ieeexplore.ieee.org/document/596123 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-6149 |
ispartof | 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997, Vol.2, p.1059-1062 vol.2 |
issn | 1520-6149 2379-190X |
language | eng |
recordid | cdi_ieee_primary_596123 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Applied sciences ; Electronic mail ; Exact sciences and technology ; Hidden Markov models ; Information, signal and communications theory ; Loudspeakers ; Signal processing ; Speech processing ; Speech processing and communication systems ; Speech recognition ; Telecommunications and information theory ; Testing ; Vocabulary |
title | Speaker-adapted training on the Switchboard Corpus |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T20%3A15%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Speaker-adapted%20training%20on%20the%20Switchboard%20Corpus&rft.btitle=1997%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech,%20and%20Signal%20Processing&rft.au=McDonough,%20J.&rft.date=1997&rft.volume=2&rft.spage=1059&rft.epage=1062%20vol.2&rft.pages=1059-1062%20vol.2&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=0818679190&rft.isbn_list=9780818679193&rft_id=info:doi/10.1109/ICASSP.1997.596123&rft_dat=%3Cpascalfrancis_6IE%3E2277459%3C/pascalfrancis_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i489-27d12318f1765002319a9166b7f7d6bd1cda9b4a5024b7a3757f693679092a8b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=596123&rfr_iscdi=true |