Speaker-adapted training on the Switchboard Corpus

Speaker adaptation is the process of transforming some speaker-independent acoustic model in such a way as to more closely match the characteristics of a particular speaker. It has been shown by several researchers to be an effective means of improving the performance of large vocabulary continuous...

Bibliographic Details
Main Authors: McDonough, J., Anastasakos, T., Zavaliagkos, G., Gish, H.
Format: Conference Proceeding
Language: English
container_end_page 1062 vol.2
container_start_page 1059
container_volume 2
creator McDonough, J.
Anastasakos, T.
Zavaliagkos, G.
Gish, H.
description Speaker adaptation is the process of transforming some speaker-independent acoustic model in such a way as to more closely match the characteristics of a particular speaker. It has been shown by several researchers to be an effective means of improving the performance of large vocabulary continuous speech recognition systems. Until very recently speaker adaptation has been used exclusively as a part of the recognition process. This is undesirable inasmuch as it leads to a mismatched condition between test and training, and hence sub-optimal recognition performance. There has been a growing interest in applying speaker-adaptation techniques to HMM training in order to alleviate the training/test mismatch. In prior work, we presented an iterative scheme for determining the maximum likelihood solution for the set of speaker-independent means and variances when speaker-dependent adaptation is performed during HMM training. In the present work, we investigate specific issues encountered in applying this general framework to the task of improving recognition performance on the Switchboard Corpus.
doi_str_mv 10.1109/ICASSP.1997.596123
format conference_proceeding
fullrecord <record><control><sourceid>pascalfrancis_6IE</sourceid><recordid>TN_cdi_ieee_primary_596123</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>596123</ieee_id><sourcerecordid>2277459</sourcerecordid><originalsourceid>FETCH-LOGICAL-i489-27d12318f1765002319a9166b7f7d6bd1cda9b4a5024b7a3757f693679092a8b3</originalsourceid><addsrcrecordid>eNo9kM1LxDAQxYMf4LruP7CnHrx2zaRNpnOU4hcsKHQP3pZpk7rRtS1pRfzvDVR8l5nHPIYfT4g1yA2ApJun8raqXjZAhBtNBlR2IhYqQ0qB5OupuJQFFAYpujOxAK1kaiCnC7Eax3cZpTWSNAuhqsHxhwspWx4mZ5MpsO9895b0XTIdXFJ9-6k51D0Hm5R9GL7GK3He8nF0q7-5FLv7u135mG6fHyLXNvV5QalCG6mgaAGNljKuxATG1NiiNbWFxjLVOWup8ho5Q42toSwyS1Jc1NlSXM9vBx4bPraBu8aP-yH4Tw4_e6UQc00xtp5j3jn3f507yX4Bb2lRfw</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Speaker-adapted training on the Switchboard Corpus</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>McDonough, J. ; Anastasakos, T. ; Zavaliagkos, G. ; Gish, H.</creator><creatorcontrib>McDonough, J. ; Anastasakos, T. ; Zavaliagkos, G. ; Gish, H.</creatorcontrib><description>Speaker adaptation is the process of transforming some speaker-independent acoustic model in such a way as to more closely match the characteristics of a particular speaker. It has been shown by several researchers to be an effective means of improving the performance of large vocabulary continuous speech recognition systems. Until very recently speaker adaptation has been used exclusively as a part of the recognition process. This is undesirable inasmuch as it leads to a mismatched condition between test and training, and hence sub-optimal recognition performance. There has been a growing interest in applying speaker-adaptation techniques to HMM training in order to alleviate the training/test mismatch. 
In prior work, we presented an iterative scheme for determining the maximum likelihood solution for the set of speaker-independent means and variances when speaker-dependent adaptation is performed during HMM training. In the present work, we investigate specific issues encountered in applying this general framework to the task of improving recognition performance on the Switchboard Corpus.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 0818679190</identifier><identifier>ISBN: 9780818679193</identifier><identifier>EISSN: 2379-190X</identifier><identifier>DOI: 10.1109/ICASSP.1997.596123</identifier><language>eng</language><publisher>Washington DC: IEEE</publisher><subject>Applied sciences ; Electronic mail ; Exact sciences and technology ; Hidden Markov models ; Information, signal and communications theory ; Loudspeakers ; Signal processing ; Speech processing ; Speech processing and communication systems ; Speech recognition ; Telecommunications and information theory ; Testing ; Vocabulary</subject><ispartof>1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997, Vol.2, p.1059-1062 vol.2</ispartof><rights>1998 INIST-CNRS</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/596123$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/596123$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=2277459$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>McDonough, J.</creatorcontrib><creatorcontrib>Anastasakos, 
T.</creatorcontrib><creatorcontrib>Zavaliagkos, G.</creatorcontrib><creatorcontrib>Gish, H.</creatorcontrib><title>Speaker-adapted training on the Switchboard Corpus</title><title>1997 IEEE International Conference on Acoustics, Speech, and Signal Processing</title><addtitle>ICASSP</addtitle><description>Speaker adaptation is the process of transforming some speaker-independent acoustic model in such a way as to more closely match the characteristics of a particular speaker. It has been shown by several researchers to be an effective means of improving the performance of large vocabulary continuous speech recognition systems. Until very recently speaker adaptation has been used exclusively as a part of the recognition process. This is undesirable inasmuch as it leads to a mismatched condition between test and training, and hence sub-optimal recognition performance. There has been a growing interest in applying speaker-adaptation techniques to HMM training in order to alleviate the training/test mismatch. In prior work, we presented an iterative scheme for determining the maximum likelihood solution for the set of speaker-independent means and variances when speaker-dependent adaptation is performed during HMM training. 
In the present work, we investigate specific issues encountered in applying this general framework to the task of improving recognition performance on the Switchboard Corpus.</description><subject>Applied sciences</subject><subject>Electronic mail</subject><subject>Exact sciences and technology</subject><subject>Hidden Markov models</subject><subject>Information, signal and communications theory</subject><subject>Loudspeakers</subject><subject>Signal processing</subject><subject>Speech processing</subject><subject>Speech processing and communication systems</subject><subject>Speech recognition</subject><subject>Telecommunications and information theory</subject><subject>Testing</subject><subject>Vocabulary</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>0818679190</isbn><isbn>9780818679193</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>1997</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNo9kM1LxDAQxYMf4LruP7CnHrx2zaRNpnOU4hcsKHQP3pZpk7rRtS1pRfzvDVR8l5nHPIYfT4g1yA2ApJun8raqXjZAhBtNBlR2IhYqQ0qB5OupuJQFFAYpujOxAK1kaiCnC7Eax3cZpTWSNAuhqsHxhwspWx4mZ5MpsO9895b0XTIdXFJ9-6k51D0Hm5R9GL7GK3He8nF0q7-5FLv7u135mG6fHyLXNvV5QalCG6mgaAGNljKuxATG1NiiNbWFxjLVOWup8ho5Q42toSwyS1Jc1NlSXM9vBx4bPraBu8aP-yH4Tw4_e6UQc00xtp5j3jn3f507yX4Bb2lRfw</recordid><startdate>1997</startdate><enddate>1997</enddate><creator>McDonough, J.</creator><creator>Anastasakos, T.</creator><creator>Zavaliagkos, G.</creator><creator>Gish, H.</creator><general>IEEE</general><general>IEEE Computer Society Press</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope><scope>IQODW</scope></search><sort><creationdate>1997</creationdate><title>Speaker-adapted training on the Switchboard Corpus</title><author>McDonough, J. ; Anastasakos, T. ; Zavaliagkos, G. 
; Gish, H.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i489-27d12318f1765002319a9166b7f7d6bd1cda9b4a5024b7a3757f693679092a8b3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>1997</creationdate><topic>Applied sciences</topic><topic>Electronic mail</topic><topic>Exact sciences and technology</topic><topic>Hidden Markov models</topic><topic>Information, signal and communications theory</topic><topic>Loudspeakers</topic><topic>Signal processing</topic><topic>Speech processing</topic><topic>Speech processing and communication systems</topic><topic>Speech recognition</topic><topic>Telecommunications and information theory</topic><topic>Testing</topic><topic>Vocabulary</topic><toplevel>online_resources</toplevel><creatorcontrib>McDonough, J.</creatorcontrib><creatorcontrib>Anastasakos, T.</creatorcontrib><creatorcontrib>Zavaliagkos, G.</creatorcontrib><creatorcontrib>Gish, H.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>McDonough, J.</au><au>Anastasakos, T.</au><au>Zavaliagkos, G.</au><au>Gish, H.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Speaker-adapted training on the Switchboard Corpus</atitle><btitle>1997 IEEE International Conference on Acoustics, Speech, and Signal Processing</btitle><stitle>ICASSP</stitle><date>1997</date><risdate>1997</risdate><volume>2</volume><spage>1059</spage><epage>1062 
vol.2</epage><pages>1059-1062 vol.2</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>0818679190</isbn><isbn>9780818679193</isbn><abstract>Speaker adaptation is the process of transforming some speaker-independent acoustic model in such a way as to more closely match the characteristics of a particular speaker. It has been shown by several researchers to be an effective means of improving the performance of large vocabulary continuous speech recognition systems. Until very recently speaker adaptation has been used exclusively as a part of the recognition process. This is undesirable inasmuch as it leads to a mismatched condition between test and training, and hence sub-optimal recognition performance. There has been a growing interest in applying speaker-adaptation techniques to HMM training in order to alleviate the training/test mismatch. In prior work, we presented an iterative scheme for determining the maximum likelihood solution for the set of speaker-independent means and variances when speaker-dependent adaptation is performed during HMM training. In the present work, we investigate specific issues encountered in applying this general framework to the task of improving recognition performance on the Switchboard Corpus.</abstract><cop>Washington DC</cop><pub>IEEE</pub><doi>10.1109/ICASSP.1997.596123</doi><tpages>4</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997, Vol.2, p.1059-1062 vol.2
issn 1520-6149
2379-190X
language eng
recordid cdi_ieee_primary_596123
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Applied sciences
Electronic mail
Exact sciences and technology
Hidden Markov models
Information, signal and communications theory
Loudspeakers
Signal processing
Speech processing
Speech processing and communication systems
Speech recognition
Telecommunications and information theory
Testing
Vocabulary
title Speaker-adapted training on the Switchboard Corpus
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T20%3A15%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Speaker-adapted%20training%20on%20the%20Switchboard%20Corpus&rft.btitle=1997%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech,%20and%20Signal%20Processing&rft.au=McDonough,%20J.&rft.date=1997&rft.volume=2&rft.spage=1059&rft.epage=1062%20vol.2&rft.pages=1059-1062%20vol.2&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=0818679190&rft.isbn_list=9780818679193&rft_id=info:doi/10.1109/ICASSP.1997.596123&rft_dat=%3Cpascalfrancis_6IE%3E2277459%3C/pascalfrancis_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i489-27d12318f1765002319a9166b7f7d6bd1cda9b4a5024b7a3757f693679092a8b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=596123&rfr_iscdi=true