Pronunciation modeling using a finite-state transducer representation
The MIT SUMMIT speech recognition system models pronunciation using a phonemic baseform dictionary along with rewrite rules for modeling phonological variation and multi-word reductions. Each pronunciation component is encoded within a finite-state transducer (FST) representation whose transition we...
Published in: | Speech communication 2005-06, Vol.46 (2), p.189-203 |
---|---|
Main Authors: | Hazen, Timothy J.; Hetherington, I. Lee; Shu, Han; Livescu, Karen |
Format: | Article |
Language: | English |
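Subjects: | Applied sciences; Exact sciences and technology; Information, signal and communications theory; Signal processing; Speech processing; Telecommunications and information theory |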
cited_by | cdi_FETCH-LOGICAL-c475t-320e205d7154ad2bf14267ceaf1cb02aa0cdaf3f9d1ff2ce346924f5e98cca533 |
---|---|
cites | cdi_FETCH-LOGICAL-c475t-320e205d7154ad2bf14267ceaf1cb02aa0cdaf3f9d1ff2ce346924f5e98cca533 |
container_end_page | 203 |
container_issue | 2 |
container_start_page | 189 |
container_title | Speech communication |
container_volume | 46 |
creator | Hazen, Timothy J.; Hetherington, I. Lee; Shu, Han; Livescu, Karen |
description | The MIT SUMMIT speech recognition system models pronunciation using a phonemic baseform dictionary along with rewrite rules for modeling phonological variation and multi-word reductions. Each pronunciation component is encoded within a finite-state transducer (FST) representation whose transition weights can be trained using an EM algorithm for finite-state networks. This paper explains the modeling approach we use and the details of its realization. We demonstrate the benefits and weaknesses of the approach both conceptually and empirically using the recognizer for our JUPITER weather information system. Our experiments demonstrate that the use of phonological rewrite rules within our system achieves word error rate reductions between 4% and 9% over different test sets when compared against a system using no phonological rewrite rules. |
doi_str_mv | 10.1016/j.specom.2005.03.004 |
format | article |
publisher | Amsterdam: Elsevier B.V |
rights | 2005 Elsevier B.V.; 2005 INIST-CNRS |
date | 2005-06-01 |
coden | SCOMDH |
tpages | 15 |
els_id | S0167639305000361 |
lds50 | peer_reviewed |
oa | free_for_read |
backlink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=16928893 (View record in Pascal Francis) |
fulltext | fulltext |
identifier | ISSN: 0167-6393 |
ispartof | Speech communication, 2005-06, Vol.46 (2), p.189-203 |
issn | 0167-6393 1872-7182 |
language | eng |
recordid | cdi_proquest_miscellaneous_85631071 |
source | Elsevier; Linguistics and Language Behavior Abstracts (LLBA) |
subjects | Applied sciences; Exact sciences and technology; Information, signal and communications theory; Signal processing; Speech processing; Telecommunications and information theory |
title | Pronunciation modeling using a finite-state transducer representation |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T13%3A00%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Pronunciation%20modeling%20using%20a%20finite-state%20transducer%20representation&rft.jtitle=Speech%20communication&rft.au=Hazen,%20Timothy%20J.&rft.date=2005-06-01&rft.volume=46&rft.issue=2&rft.spage=189&rft.epage=203&rft.pages=189-203&rft.issn=0167-6393&rft.eissn=1872-7182&rft.coden=SCOMDH&rft_id=info:doi/10.1016/j.specom.2005.03.004&rft_dat=%3Cproquest_cross%3E85631071%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c475t-320e205d7154ad2bf14267ceaf1cb02aa0cdaf3f9d1ff2ce346924f5e98cca533%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=29528648&rft_id=info:pmid/&rfr_iscdi=true |