
Active learning and semi-supervised learning for speech recognition: A unified framework using the global entropy reduction maximization criterion

We propose a unified global entropy reduction maximization (GERM) framework for active learning and semi-supervised learning for speech recognition. Active learning aims to select a limited subset of utterances for transcribing from a large amount of un-transcribed utterances, while semi-supervised learning addresses the problem of selecting right transcriptions for un-transcribed utterances, so that the accuracy of the automatic speech recognition system can be maximized. We show that both the traditional confidence-based active learning and semi-supervised learning approaches can be improved by maximizing the lattice entropy reduction over the whole dataset. We introduce our criterion and framework, show how the criterion can be simplified and approximated, and describe how these approaches can be combined. We demonstrate the effectiveness of our new framework and algorithm with directory assistance data collected under the real usage scenarios and show that our GERM based active learning and semi-supervised learning algorithms consistently outperform the confidence-based counterparts by a significant margin. Using our new active learning algorithm cuts the number of utterances needed for transcribing by 50% to achieve the same recognition accuracy obtained using the confidence-based active learning approach, and by 60% compared to the random sampling approach. Using our new semi-supervised algorithm we can determine the cutoff point in determining which utterance-transcription pair to use in a principled way by demonstrating that the point it finds is very close to the achievable peak point.
Bibliographic Details
Published in: Computer Speech & Language, 2010-07, Vol. 24 (3), p. 433-444
Main Authors: Yu, Dong; Varadarajan, Balakrishnan; Deng, Li; Acero, Alex
Format: Article
Language: English
Subjects: Acoustic model; Active learning; Collective information; Confidence; Entropy reduction; Lattice; Semi-supervised learning
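The abstract contrasts the traditional confidence-based criterion (transcribe the utterances the recognizer is least sure about) with the GERM criterion (transcribe the utterances whose labels would remove the most entropy over the whole dataset). A minimal, hypothetical sketch of how the two selection criteria differ; the data layout and the `reduction` estimator are illustrative assumptions, not the paper's actual algorithm:

```python
import math

def lattice_entropy(posteriors):
    """Shannon entropy (nats) of a path-posterior distribution from a decoding lattice."""
    return -sum(p * math.log(p) for p in posteriors if p > 0)

def select_by_confidence(utterances, budget):
    """Confidence-based baseline: rank utterances by their best-path posterior
    (confidence) and transcribe the least confident ones first."""
    ranked = sorted(utterances, key=lambda u: max(u["posteriors"]))
    return [u["id"] for u in ranked[:budget]]

def select_by_entropy_reduction(utterances, budget, reduction):
    """GERM-style selection (simplified): rank utterances by an estimate of how
    much entropy their transcription would remove across the whole dataset.
    `reduction(u, dataset)` is a hypothetical estimator of that global effect."""
    ranked = sorted(utterances, key=lambda u: reduction(u, utterances), reverse=True)
    return [u["id"] for u in ranked[:budget]]
```

Note that the two criteria can disagree: a two-way 0.5/0.5 lattice has the lowest confidence, but a flatter three-way lattice (e.g. 0.7/0.2/0.1) carries more entropy and so may be the better pick under an entropy-reduction criterion.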
DOI: 10.1016/j.csl.2009.03.004
ISSN: 0885-2308
EISSN: 1095-8363