Logistic discriminative speech detectors using posterior SNR

We introduce an elegant and novel design for a speech detector which estimates the probability of the presence of speech in each time-frequency bin, as well as in each frame. The proposed system uses discriminative estimators based on logistic regression, and incorporates spectral and temporal corre...

Bibliographic Details
Main Authors: Surendran, A.C., Sukittanon, S., Platt, J.
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 625
container_issue
container_start_page V
container_title
container_volume 5
creator Surendran, A.C.
Sukittanon, S.
Platt, J.
description We introduce an elegant and novel design for a speech detector which estimates the probability of the presence of speech in each time-frequency bin, as well as in each frame. The proposed system uses discriminative estimators based on logistic regression, and incorporates spectral and temporal correlations in the same framework. The detector is flexible enough to be configured in a single level or a "stacked" bilevel architecture depending on the needs of the application. An important part of the proposed design is the use of a new set of features: the normalized logarithm of the estimated posterior signal-to-noise ratio. These can be easily and automatically generated by tracking the noise spectrum online. We present results on the AURORA database to demonstrate that the overall design is simple, flexible and effective.
doi_str_mv 10.1109/ICASSP.2004.1327188
format conference_proceeding
fullrecord <record><control><sourceid>pascalfrancis_6IE</sourceid><recordid>TN_cdi_ieee_primary_1327188</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1327188</ieee_id><sourcerecordid>17610985</sourcerecordid><originalsourceid>FETCH-LOGICAL-i958-1fbcc5b3b77a07b17288a316401177f0a204a8b37c062478d451b562e72b44573</originalsourceid><addsrcrecordid>eNpFkEtLw0AUhQcfYK39Bd1k4zL13nnk3oAbKb6gqJgu3JWZ6aSO1CZkouC_NxBBOHAW5-NwOELMERaIUF49Lm-q6mUhAfQClSRkPhITqajMsYS3YzEriWGQYs1anogJGgl5gbo8E-cpfQAAk-aJuF41u5j66LNtTL6Ln_Fg-_gdstSG4N-zbeiD75suZV8pHnZZ26Q-dLHpsurp9UKc1nafwuzPp2J9d7tePuSr5_th4iqPpeEca-e9ccoRWSCHJJmtwkIDIlENVoK27BR5KKQm3mqDzhQykHRaG1JTcTnWtjZ5u687e_Axbdphre1-NkjF8AmbgZuPXAwh_MfjP-oXzGBWRw</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Logistic discriminative speech detectors using posterior SNR</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Surendran, A.C. ; Sukittanon, S. ; Platt, J.</creator><creatorcontrib>Surendran, A.C. ; Sukittanon, S. ; Platt, J.</creatorcontrib><description>We introduce an elegant and novel design for a speech detector which estimates the probability of the presence of speech in each time-frequency bin, as well as in each frame. The proposed system uses discriminative estimators based on logistic regression, and incorporates spectral and temporal correlations in the same framework. The detector is flexible enough to be configured in a single level or a "stacked" bilevel architecture depending on the needs of the application. An important part of the proposed design is the use of a new set of features: the normalized logarithm of the estimated posterior signal-to-noise ratio. These can be easily and automatically generated by tracking the noise spectrum online. 
We present results on the AURORA database to demonstrate that the overall design is simple, flexible and effective.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9780780384842</identifier><identifier>ISBN: 0780384849</identifier><identifier>EISSN: 2379-190X</identifier><identifier>DOI: 10.1109/ICASSP.2004.1327188</identifier><language>eng</language><publisher>Piscataway, N.J: IEEE</publisher><subject>Applied sciences ; Detectors ; Exact sciences and technology ; Information, signal and communications theory ; Logistics ; Noise generators ; Signal design ; Signal processing ; Signal to noise ratio ; Spatial databases ; Speech coding ; Speech enhancement ; Speech processing ; Telecommunications and information theory ; Testing ; Time frequency analysis</subject><ispartof>2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004, Vol.5, p.V-625</ispartof><rights>2006 INIST-CNRS</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1327188$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1327188$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=17610985$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Surendran, A.C.</creatorcontrib><creatorcontrib>Sukittanon, S.</creatorcontrib><creatorcontrib>Platt, J.</creatorcontrib><title>Logistic discriminative speech detectors using posterior SNR</title><title>2004 IEEE International Conference on Acoustics, Speech, and Signal 
Processing</title><addtitle>ICASSP</addtitle><description>We introduce an elegant and novel design for a speech detector which estimates the probability of the presence of speech in each time-frequency bin, as well as in each frame. The proposed system uses discriminative estimators based on logistic regression, and incorporates spectral and temporal correlations in the same framework. The detector is flexible enough to be configured in a single level or a "stacked" bilevel architecture depending on the needs of the application. An important part of the proposed design is the use of a new set of features: the normalized logarithm of the estimated posterior signal-to-noise ratio. These can be easily and automatically generated by tracking the noise spectrum online. We present results on the AURORA database to demonstrate that the overall design is simple, flexible and effective.</description><subject>Applied sciences</subject><subject>Detectors</subject><subject>Exact sciences and technology</subject><subject>Information, signal and communications theory</subject><subject>Logistics</subject><subject>Noise generators</subject><subject>Signal design</subject><subject>Signal processing</subject><subject>Signal to noise ratio</subject><subject>Spatial databases</subject><subject>Speech coding</subject><subject>Speech enhancement</subject><subject>Speech processing</subject><subject>Telecommunications and information theory</subject><subject>Testing</subject><subject>Time frequency 
analysis</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9780780384842</isbn><isbn>0780384849</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2004</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpFkEtLw0AUhQcfYK39Bd1k4zL13nnk3oAbKb6gqJgu3JWZ6aSO1CZkouC_NxBBOHAW5-NwOELMERaIUF49Lm-q6mUhAfQClSRkPhITqajMsYS3YzEriWGQYs1anogJGgl5gbo8E-cpfQAAk-aJuF41u5j66LNtTL6Ln_Fg-_gdstSG4N-zbeiD75suZV8pHnZZ26Q-dLHpsurp9UKc1nafwuzPp2J9d7tePuSr5_th4iqPpeEca-e9ccoRWSCHJJmtwkIDIlENVoK27BR5KKQm3mqDzhQykHRaG1JTcTnWtjZ5u687e_Axbdphre1-NkjF8AmbgZuPXAwh_MfjP-oXzGBWRw</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Surendran, A.C.</creator><creator>Sukittanon, S.</creator><creator>Platt, J.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope><scope>IQODW</scope></search><sort><creationdate>2004</creationdate><title>Logistic discriminative speech detectors using posterior SNR</title><author>Surendran, A.C. ; Sukittanon, S. 
; Platt, J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i958-1fbcc5b3b77a07b17288a316401177f0a204a8b37c062478d451b562e72b44573</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Applied sciences</topic><topic>Detectors</topic><topic>Exact sciences and technology</topic><topic>Information, signal and communications theory</topic><topic>Logistics</topic><topic>Noise generators</topic><topic>Signal design</topic><topic>Signal processing</topic><topic>Signal to noise ratio</topic><topic>Spatial databases</topic><topic>Speech coding</topic><topic>Speech enhancement</topic><topic>Speech processing</topic><topic>Telecommunications and information theory</topic><topic>Testing</topic><topic>Time frequency analysis</topic><toplevel>online_resources</toplevel><creatorcontrib>Surendran, A.C.</creatorcontrib><creatorcontrib>Sukittanon, S.</creatorcontrib><creatorcontrib>Platt, J.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Surendran, A.C.</au><au>Sukittanon, S.</au><au>Platt, J.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Logistic discriminative speech detectors using posterior SNR</atitle><btitle>2004 IEEE International Conference on Acoustics, Speech, and Signal 
Processing</btitle><stitle>ICASSP</stitle><date>2004</date><risdate>2004</risdate><volume>5</volume><spage>V</spage><epage>625</epage><pages>V-625</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9780780384842</isbn><isbn>0780384849</isbn><abstract>We introduce an elegant and novel design for a speech detector which estimates the probability of the presence of speech in each time-frequency bin, as well as in each frame. The proposed system uses discriminative estimators based on logistic regression, and incorporates spectral and temporal correlations in the same framework. The detector is flexible enough to be configured in a single level or a "stacked" bilevel architecture depending on the needs of the application. An important part of the proposed design is the use of a new set of features: the normalized logarithm of the estimated posterior signal-to-noise ratio. These can be easily and automatically generated by tracking the noise spectrum online. We present results on the AURORA database to demonstrate that the overall design is simple, flexible and effective.</abstract><cop>Piscataway, N.J</cop><pub>IEEE</pub><doi>10.1109/ICASSP.2004.1327188</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004, Vol.5, p.V-625
issn 1520-6149
2379-190X
language eng
recordid cdi_ieee_primary_1327188
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Applied sciences
Detectors
Exact sciences and technology
Information, signal and communications theory
Logistics
Noise generators
Signal design
Signal processing
Signal to noise ratio
Spatial databases
Speech coding
Speech enhancement
Speech processing
Telecommunications and information theory
Testing
Time frequency analysis
title Logistic discriminative speech detectors using posterior SNR
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T18%3A48%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Logistic%20discriminative%20speech%20detectors%20using%20posterior%20SNR&rft.btitle=2004%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech,%20and%20Signal%20Processing&rft.au=Surendran,%20A.C.&rft.date=2004&rft.volume=5&rft.spage=V&rft.epage=625&rft.pages=V-625&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9780780384842&rft.isbn_list=0780384849&rft_id=info:doi/10.1109/ICASSP.2004.1327188&rft_dat=%3Cpascalfrancis_6IE%3E17610985%3C/pascalfrancis_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i958-1fbcc5b3b77a07b17288a316401177f0a204a8b37c062478d451b562e72b44573%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1327188&rfr_iscdi=true
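
The description field above outlines a detector built from logistic regression on the normalized logarithm of the estimated posterior SNR, with the noise spectrum tracked online. The following is only a minimal frame-level sketch of that idea under stated assumptions, not the authors' implementation: the gated recursive noise tracker (`track_noise`), the per-bin zero-mean/unit-variance normalization, the plain gradient-descent fit, and the synthetic data are all hypothetical choices made for illustration. The record does not specify any of them, and the per-bin estimates and "stacked" bilevel configuration mentioned in the abstract are not modelled here.

```python
import numpy as np

def track_noise(power, alpha=0.95, gate=3.0, init_frames=10):
    """Hypothetical online noise tracker (an assumption, not from the paper).
    Starts from the mean of the first frames (assumed noise-only) and updates
    a bin only when its power stays below gate * current estimate."""
    noise = np.empty_like(power)
    est = power[:init_frames].mean(axis=0)
    for t in range(power.shape[0]):
        quiet = power[t] < gate * est
        est = np.where(quiet, alpha * est + (1 - alpha) * power[t], est)
        noise[t] = est
    return noise

def log_posterior_snr(power, noise, eps=1e-10):
    """Log of observed power over estimated noise power, per bin."""
    return np.log((power + eps) / (noise + eps))

def normalize(x):
    """Assumed normalization: zero mean, unit variance per frequency bin."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-10)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_logistic(x, y, lr=0.1, steps=500):
    """Logistic regression fitted by batch gradient descent on the log-loss."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = sigmoid(x @ w + b) - y
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Synthetic spectrogram powers: exponential noise plus stronger "speech" bursts.
rng = np.random.default_rng(0)
frames, bins = 400, 8
labels = np.zeros(frames)
labels[100:200] = 1.0
labels[300:350] = 1.0
power = rng.exponential(1.0, size=(frames, bins))
power += labels[:, None] * 20.0 * rng.exponential(1.0, size=(frames, bins))

features = normalize(log_posterior_snr(power, track_noise(power)))
w, b = fit_logistic(features, labels)
frame_prob = sigmoid(features @ w + b)  # estimated P(speech) for each frame
```

Here `frame_prob` holds one speech-presence probability per frame. A per-bin detector in the same spirit would fit a separate logistic model for each time-frequency bin, using neighbouring bins and frames as inputs.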