
A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions

In this paper, we present our recent development of a model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task. The algorithm consists of two main steps. First, the noise and channel parameters are estimated using mult...

Bibliographic Details
Published in:	Computer speech & language, 2009-07, Vol.23 (3), p.389-405
Main Authors:	Li, Jinyu; Deng, Li; Yu, Dong; Gong, Yifan; Acero, Alex
Format:	Article
Language:	English
Subjects:	Additive and convolutive distortions; Applied linguistics; Computational linguistics; Joint compensation; Linguistics; Phase-sensitive distortion model; Robust ASR; Vector Taylor series

cited_by	cdi_FETCH-LOGICAL-c389t-ea7e6decd66ecb0caaed3c5600a493ac90bdd56a5019a2dcbab380363355d1463
cites	cdi_FETCH-LOGICAL-c389t-ea7e6decd66ecb0caaed3c5600a493ac90bdd56a5019a2dcbab380363355d1463
container_end_page	405
container_issue	3
container_start_page	389
container_title	Computer speech & language
container_volume	23
creator	Li, Jinyu; Deng, Li; Yu, Dong; Gong, Yifan; Acero, Alex
description	In this paper, we present our recent development of a model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task. The algorithm consists of two main steps. First, the noise and channel parameters are estimated using multi-sources of information including a nonlinear environment-distortion model in the cepstral domain, the posterior probabilities of all the Gaussians in speech recognizer, and truncated vector Taylor series (VTS) approximation. Second, the estimated noise and channel parameters are used to adapt the static and dynamic portions (delta and delta–delta) of the HMM means and variances. This two-step algorithm enables joint compensation of both additive and convolutive distortions (JAC). The hallmark of our new approach is the use of a nonlinear, phase-sensitive model of acoustic distortion that captures phase asynchrony between clean speech and the mixing noise. In the experimental evaluation using the standard Aurora 2 task, the proposed Phase-JAC/VTS algorithm achieves 93.32% word accuracy using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation. This represents high recognition performance on this task without discriminative training of the HMM system. The experimental results show that the phase term, which was missing in all previous HMM adaptation work, contributes significantly to the achieved high recognition accuracy.
doi_str_mv	10.1016/j.csl.2009.02.001
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_85705524</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0885230809000035</els_id><sourcerecordid>85705524</sourcerecordid><originalsourceid>FETCH-LOGICAL-c389t-ea7e6decd66ecb0caaed3c5600a493ac90bdd56a5019a2dcbab380363355d1463</originalsourceid><addsrcrecordid>eNqNkMFOGzEQhi3USk0pD9CbL-1tl_F67eyqJ4TaggTiAmdrYs8Kpxs7tZ0g3h6HII5VT5bt7_9H8zH2VUArQOjzdWvz3HYAYwtdCyBO2ELAqJpBavmBLWAYVNNJGD6xzzmvAUCrfrlgdMF3wU-eHJ8Sbugppj88Tvzq9pajw23B4mPgT7488nX0oXAbN1sK-fheSXTOF78njsHVz7CP8-717nwuMR2w_IV9nHDOdPZ2nrKHXz_vL6-am7vf15cXN42Vw1gawiVpR9ZpTXYFFpGctEoDYD9KtCOsnFMaFYgRO2dXuJID1AWlUk70Wp6y78febYp_d5SL2fhsaZ4xUNxlM6glKNX1_wFK1Y_DoVEcQZtizokms01-g-nZCDAH82ZtqnlzMG-gM9V8zXx7K8dsca5eg_X5PdiJHkB2qnI_jhxVJXtPyWTrKVhyPpEtxkX_jykvErqa1g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>85354986</pqid></control><display><type>article</type><title>A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions</title><source>ScienceDirect Freedom Collection</source><source>Linguistics and Language Behavior Abstracts (LLBA)</source><creator>Li, Jinyu ; Deng, Li ; Yu, Dong ; Gong, Yifan ; Acero, Alex</creator><creatorcontrib>Li, Jinyu ; Deng, Li ; Yu, Dong ; Gong, Yifan ; Acero, Alex</creatorcontrib><description>In this paper, we present our recent development of a model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task. The algorithm consists of two main steps. First, the noise and channel parameters are estimated using multi-sources of information including a nonlinear environment-distortion model in the cepstral domain, the posterior probabilities of all the Gaussians in speech recognizer, and truncated vector Taylor series (VTS) approximation. 
Second, the estimated noise and channel parameters are used to adapt the static and dynamic portions (delta and delta–delta) of the HMM means and variances. This two-step algorithm enables joint compensation of both additive and convolutive distortions (JAC). The hallmark of our new approach is the use of a nonlinear, phase-sensitive model of acoustic distortion that captures phase asynchrony between clean speech and the mixing noise. In the experimental evaluation using the standard Aurora 2 task, the proposed Phase-JAC/VTS algorithm achieves 93.32% word accuracy using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation. This represents high recognition performance on this task without discriminative training of the HMM system. The experimental results show that the phase term, which was missing in all previous HMM adaptation work, contributes significantly to the achieved high recognition accuracy.</description><identifier>ISSN: 0885-2308</identifier><identifier>EISSN: 1095-8363</identifier><identifier>DOI: 10.1016/j.csl.2009.02.001</identifier><identifier>CODEN: CSPLEO</identifier><language>eng</language><publisher>Kidlington: Elsevier Ltd</publisher><subject>Additive and convolutive distortions ; Applied linguistics ; Computational linguistics ; Joint compensation ; Linguistics ; Phase-sensitive distortion model ; Robust ASR ; Vector Taylor series</subject><ispartof>Computer speech & language, 2009-07, Vol.23 (3), p.389-405</ispartof><rights>2009 Elsevier Ltd</rights><rights>2009 
INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c389t-ea7e6decd66ecb0caaed3c5600a493ac90bdd56a5019a2dcbab380363355d1463</citedby><cites>FETCH-LOGICAL-c389t-ea7e6decd66ecb0caaed3c5600a493ac90bdd56a5019a2dcbab380363355d1463</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925,31270</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=21400325$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Li, Jinyu</creatorcontrib><creatorcontrib>Deng, Li</creatorcontrib><creatorcontrib>Yu, Dong</creatorcontrib><creatorcontrib>Gong, Yifan</creatorcontrib><creatorcontrib>Acero, Alex</creatorcontrib><title>A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions</title><title>Computer speech & language</title><description>In this paper, we present our recent development of a model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task. The algorithm consists of two main steps. First, the noise and channel parameters are estimated using multi-sources of information including a nonlinear environment-distortion model in the cepstral domain, the posterior probabilities of all the Gaussians in speech recognizer, and truncated vector Taylor series (VTS) approximation. Second, the estimated noise and channel parameters are used to adapt the static and dynamic portions (delta and delta–delta) of the HMM means and variances. This two-step algorithm enables joint compensation of both additive and convolutive distortions (JAC). 
The hallmark of our new approach is the use of a nonlinear, phase-sensitive model of acoustic distortion that captures phase asynchrony between clean speech and the mixing noise. In the experimental evaluation using the standard Aurora 2 task, the proposed Phase-JAC/VTS algorithm achieves 93.32% word accuracy using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation. This represents high recognition performance on this task without discriminative training of the HMM system. The experimental results show that the phase term, which was missing in all previous HMM adaptation work, contributes significantly to the achieved high recognition accuracy.</description><subject>Additive and convolutive distortions</subject><subject>Applied linguistics</subject><subject>Computational linguistics</subject><subject>Joint compensation</subject><subject>Linguistics</subject><subject>Phase-sensitive distortion model</subject><subject>Robust ASR</subject><subject>Vector Taylor series</subject><issn>0885-2308</issn><issn>1095-8363</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>7T9</sourceid><recordid>eNqNkMFOGzEQhi3USk0pD9CbL-1tl_F67eyqJ4TaggTiAmdrYs8Kpxs7tZ0g3h6HII5VT5bt7_9H8zH2VUArQOjzdWvz3HYAYwtdCyBO2ELAqJpBavmBLWAYVNNJGD6xzzmvAUCrfrlgdMF3wU-eHJ8Sbugppj88Tvzq9pajw23B4mPgT7488nX0oXAbN1sK-fheSXTOF78njsHVz7CP8-717nwuMR2w_IV9nHDOdPZ2nrKHXz_vL6-am7vf15cXN42Vw1gawiVpR9ZpTXYFFpGctEoDYD9KtCOsnFMaFYgRO2dXuJID1AWlUk70Wp6y78febYp_d5SL2fhsaZ4xUNxlM6glKNX1_wFK1Y_DoVEcQZtizokms01-g-nZCDAH82ZtqnlzMG-gM9V8zXx7K8dsca5eg_X5PdiJHkB2qnI_jhxVJXtPyWTrKVhyPpEtxkX_jykvErqa1g</recordid><startdate>20090701</startdate><enddate>20090701</enddate><creator>Li, Jinyu</creator><creator>Deng, Li</creator><creator>Yu, Dong</creator><creator>Gong, Yifan</creator><creator>Acero, Alex</creator><general>Elsevier 
Ltd</general><general>Elsevier</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8BM</scope><scope>7T9</scope></search><sort><creationdate>20090701</creationdate><title>A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions</title><author>Li, Jinyu ; Deng, Li ; Yu, Dong ; Gong, Yifan ; Acero, Alex</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c389t-ea7e6decd66ecb0caaed3c5600a493ac90bdd56a5019a2dcbab380363355d1463</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Additive and convolutive distortions</topic><topic>Applied linguistics</topic><topic>Computational linguistics</topic><topic>Joint compensation</topic><topic>Linguistics</topic><topic>Phase-sensitive distortion model</topic><topic>Robust ASR</topic><topic>Vector Taylor series</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Jinyu</creatorcontrib><creatorcontrib>Deng, Li</creatorcontrib><creatorcontrib>Yu, Dong</creatorcontrib><creatorcontrib>Gong, Yifan</creatorcontrib><creatorcontrib>Acero, Alex</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>ComDisDome</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><jtitle>Computer speech & language</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Li, Jinyu</au><au>Deng, Li</au><au>Yu, Dong</au><au>Gong, Yifan</au><au>Acero, Alex</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions</atitle><jtitle>Computer speech & 
language</jtitle><date>2009-07-01</date><risdate>2009</risdate><volume>23</volume><issue>3</issue><spage>389</spage><epage>405</epage><pages>389-405</pages><issn>0885-2308</issn><eissn>1095-8363</eissn><coden>CSPLEO</coden><abstract>In this paper, we present our recent development of a model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task. The algorithm consists of two main steps. First, the noise and channel parameters are estimated using multi-sources of information including a nonlinear environment-distortion model in the cepstral domain, the posterior probabilities of all the Gaussians in speech recognizer, and truncated vector Taylor series (VTS) approximation. Second, the estimated noise and channel parameters are used to adapt the static and dynamic portions (delta and delta–delta) of the HMM means and variances. This two-step algorithm enables joint compensation of both additive and convolutive distortions (JAC). The hallmark of our new approach is the use of a nonlinear, phase-sensitive model of acoustic distortion that captures phase asynchrony between clean speech and the mixing noise. In the experimental evaluation using the standard Aurora 2 task, the proposed Phase-JAC/VTS algorithm achieves 93.32% word accuracy using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation. This represents high recognition performance on this task without discriminative training of the HMM system. The experimental results show that the phase term, which was missing in all previous HMM adaptation work, contributes significantly to the achieved high recognition accuracy.</abstract><cop>Kidlington</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.csl.2009.02.001</doi><tpages>17</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0885-2308
ispartof	Computer speech & language, 2009-07, Vol.23 (3), p.389-405
issn	0885-2308 1095-8363
language	eng
recordid	cdi_proquest_miscellaneous_85705524
source	ScienceDirect Freedom Collection; Linguistics and Language Behavior Abstracts (LLBA)
subjects	Additive and convolutive distortions; Applied linguistics; Computational linguistics; Joint compensation; Linguistics; Phase-sensitive distortion model; Robust ASR; Vector Taylor series
title	A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T13%3A43%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20unified%20framework%20of%20HMM%20adaptation%20with%20joint%20compensation%20of%20additive%20and%20convolutive%20distortions&rft.jtitle=Computer%20speech%20&%20language&rft.au=Li,%20Jinyu&rft.date=2009-07-01&rft.volume=23&rft.issue=3&rft.spage=389&rft.epage=405&rft.pages=389-405&rft.issn=0885-2308&rft.eissn=1095-8363&rft.coden=CSPLEO&rft_id=info:doi/10.1016/j.csl.2009.02.001&rft_dat=%3Cproquest_cross%3E85705524%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c389t-ea7e6decd66ecb0caaed3c5600a493ac90bdd56a5019a2dcbab380363355d1463%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=85354986&rft_id=info:pmid/&rfr_iscdi=true
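
The adaptation step summarised in the description field can be illustrated with a small sketch. This record does not give the paper's equations, so everything below is an assumption made for illustration: the sketch works per filter-bank channel in the log-power domain (the paper works in the cepstral domain), it assumes the power relation |Y|^2 = |X|^2 |H|^2 + |N|^2 + 2*alpha*|X||H||N| with a phase factor alpha in (-1, 1], and it uses a first-order Taylor expansion with diagonal variances. The function names (phase_term, distorted_log_power, speech_gain, adapt_gaussian) are hypothetical. Parameter estimation and the delta and delta-delta features are left out.

```python
import math


def phase_term(d, alpha):
    """f(d) = 1 + e^d + 2*alpha*e^(d/2), where d = n - x - h (log-power domain).

    Assumption: alpha is a scalar phase factor in (-1, 1]; alpha = 0 gives the
    phase-insensitive model. At alpha = -1 and d = 0 the term is zero and the
    logarithm below is undefined.
    """
    return 1.0 + math.exp(d) + 2.0 * alpha * math.exp(d / 2.0)


def distorted_log_power(x, h, n, alpha=0.0):
    """Log-power of distorted speech: y = x + h + log f(n - x - h).

    Follows from taking the log of the assumed power relation
    |Y|^2 = |X|^2 |H|^2 + |N|^2 + 2*alpha*|X||H||N|.
    """
    return x + h + math.log(phase_term(n - x - h, alpha))


def speech_gain(x, h, n, alpha=0.0):
    """Partial derivative dy/dx at the expansion point; dy/dn equals 1 - dy/dx."""
    d = n - x - h
    return (1.0 + alpha * math.exp(d / 2.0)) / phase_term(d, alpha)


def adapt_gaussian(mu_x, var_x, mu_h, mu_n, var_n, alpha=0.0):
    """First-order Taylor adaptation of one diagonal Gaussian (static part only).

    The mean is the nonlinearity evaluated at the expansion point
    (mu_x, mu_h, mu_n); the variance combines the clean-speech and noise
    variances weighted by the squared partial derivatives.
    """
    mu_y, var_y = [], []
    for mx, vx, mh, mn, vn in zip(mu_x, var_x, mu_h, mu_n, var_n):
        g = speech_gain(mx, mh, mn, alpha)
        mu_y.append(distorted_log_power(mx, mh, mn, alpha))
        var_y.append(g * g * vx + (1.0 - g) ** 2 * vn)
    return mu_y, var_y
```

With alpha set to 0 the sketch reduces to the phase-insensitive form of the same expansion, which makes it easy to compare the two models on the same inputs. Very large values of n - x - h would overflow math.exp; a practical implementation would guard against that.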