Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections

In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from Danish Emotional Speech (DES) and a subset of Speech Under Simulated and Actual Stress (SUSAS) data collections. The proposed novelties are i...

Bibliographic Details
Main Authors: Ververidis, Dimitrios; Kotropoulos, Constantine
Format: Conference Proceeding
Language: English
container_end_page 5
container_start_page 1
creator Ververidis, Dimitrios
Kotropoulos, Constantine
description In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from the Danish Emotional Speech (DES) data collection and a subset of the Speech Under Simulated and Actual Stress (SUSAS) data collection. The novelties are: 1) speeding up sequential floating feature selection by up to 60%, 2) fusing decisions taken on short speech segments to derive a single decision for longer utterances, and 3) demonstrating that gender and accent information reduces the classification error. Specifically, the classification error drops by 1% to 11% when decisions are combined over long phrases, and by 2% to 11% when gender and accent information is exploited. The total classification error is 42.8% on DES and 46.3% on SUSAS. The reported human error rates are 32.3% on DES and 42% on SUSAS. For comparison, random classification would yield an error of 80% on DES and 87.5% on SUSAS.
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_7071406</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7071406</ieee_id><sourcerecordid>7071406</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-ea0718975ad8413c753bdd9942f24902901c43c450cfcf770404838d2193c373</originalsourceid><addsrcrecordid>eNpNj0FLAzEQhRdRsNT-Ai_zBxaySbbZHEttVSh4WD2XMZloJN2smxQR_7wRe-jpDfPefMy7qGacN7pupW4uz-brapHSB2NMcCZavpxVP1tMGRJ9HmnIHgO4EDH74Q1cnL5wssULZLKPA-A4Bk8WcgQ6xL9VyaeRyLyDI8zHiRJQyv6AucTKxd2mBxws9C_9qgeLGcHEcOKlm-rKYUi0OOm86reb5_VDvXu6f1yvdrXXLNeETDWdVi3aTjbCqFa8Wqu15I5LzbhmjZHCyJYZZ5xSTDLZic6WzsIIJebV7T_VE9F-nMp30_deFahkS_ELw5RX9Q</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections</title><source>IEEE Xplore All Conference Series</source><creator>Ververidis, Dimitrios ; Kotropoulos, Constantine</creator><creatorcontrib>Ververidis, Dimitrios ; Kotropoulos, Constantine</creatorcontrib><description>In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from Danish Emotional Speech (DES) and a subset of Speech Under Simulated and Actual Stress (SUSAS) data collections. The proposed novelties are in: 1) speeding up the sequential floating feature selection up to 60%, 2) applying fusion of decisions taken on short speech segments in order to derive a unique decision for longer utterances, and 3) demonstrating that gender and accent information reduce the classification error. 
Indeed, a lower classification error by 1% to 11% is achieved, when the combination of decisions is made on long phrases and an error reduction by 2%-11% is obtained, when the gender and the accent information is exploited. The total classification error reported on DES is 42.8%. The same figure on SUSAS is 46.3%. The reported human errors have been 32.3% in DES and 42% in SUSAS. For comparison purposes, a random classification would yield an error of 80% in DES and 87.5% in SUSAS, respectively.</description><identifier>ISSN: 2219-5491</identifier><identifier>EISSN: 2219-5491</identifier><language>eng</language><publisher>IEEE</publisher><subject>Europe ; Feature extraction ; Signal processing algorithms ; Speech ; Speech processing ; Stress</subject><ispartof>2006 14th European Signal Processing Conference, 2006, p.1-5</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7071406$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,23909,23910,25118,54530,54907</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7071406$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ververidis, Dimitrios</creatorcontrib><creatorcontrib>Kotropoulos, Constantine</creatorcontrib><title>Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections</title><title>2006 14th European Signal Processing Conference</title><addtitle>EUSIPCO</addtitle><description>In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from Danish Emotional Speech (DES) and a subset of Speech Under Simulated and Actual Stress (SUSAS) data 
collections. The proposed novelties are in: 1) speeding up the sequential floating feature selection up to 60%, 2) applying fusion of decisions taken on short speech segments in order to derive a unique decision for longer utterances, and 3) demonstrating that gender and accent information reduce the classification error. Indeed, a lower classification error by 1% to 11% is achieved, when the combination of decisions is made on long phrases and an error reduction by 2%-11% is obtained, when the gender and the accent information is exploited. The total classification error reported on DES is 42.8%. The same figure on SUSAS is 46.3%. The reported human errors have been 32.3% in DES and 42% in SUSAS. For comparison purposes, a random classification would yield an error of 80% in DES and 87.5% in SUSAS, respectively.</description><subject>Europe</subject><subject>Feature extraction</subject><subject>Signal processing algorithms</subject><subject>Speech</subject><subject>Speech processing</subject><subject>Stress</subject><issn>2219-5491</issn><issn>2219-5491</issn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2006</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpNj0FLAzEQhRdRsNT-Ai_zBxaySbbZHEttVSh4WD2XMZloJN2smxQR_7wRe-jpDfPefMy7qGacN7pupW4uz-brapHSB2NMcCZavpxVP1tMGRJ9HmnIHgO4EDH74Q1cnL5wssULZLKPA-A4Bk8WcgQ6xL9VyaeRyLyDI8zHiRJQyv6AucTKxd2mBxws9C_9qgeLGcHEcOKlm-rKYUi0OOm86reb5_VDvXu6f1yvdrXXLNeETDWdVi3aTjbCqFa8Wqu15I5LzbhmjZHCyJYZZ5xSTDLZic6WzsIIJebV7T_VE9F-nMp30_deFahkS_ELw5RX9Q</recordid><startdate>200609</startdate><enddate>200609</enddate><creator>Ververidis, Dimitrios</creator><creator>Kotropoulos, Constantine</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200609</creationdate><title>Fast sequential floating forward selection applied to emotional speech features estimated on DES 
and SUSAS data collections</title><author>Ververidis, Dimitrios ; Kotropoulos, Constantine</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-ea0718975ad8413c753bdd9942f24902901c43c450cfcf770404838d2193c373</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Europe</topic><topic>Feature extraction</topic><topic>Signal processing algorithms</topic><topic>Speech</topic><topic>Speech processing</topic><topic>Stress</topic><toplevel>online_resources</toplevel><creatorcontrib>Ververidis, Dimitrios</creatorcontrib><creatorcontrib>Kotropoulos, Constantine</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library Online</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ververidis, Dimitrios</au><au>Kotropoulos, Constantine</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections</atitle><btitle>2006 14th European Signal Processing Conference</btitle><stitle>EUSIPCO</stitle><date>2006-09</date><risdate>2006</risdate><spage>1</spage><epage>5</epage><pages>1-5</pages><issn>2219-5491</issn><eissn>2219-5491</eissn><abstract>In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from Danish Emotional Speech (DES) and a subset of Speech Under Simulated and Actual Stress (SUSAS) data collections. 
The proposed novelties are in: 1) speeding up the sequential floating feature selection up to 60%, 2) applying fusion of decisions taken on short speech segments in order to derive a unique decision for longer utterances, and 3) demonstrating that gender and accent information reduce the classification error. Indeed, a lower classification error by 1% to 11% is achieved, when the combination of decisions is made on long phrases and an error reduction by 2%-11% is obtained, when the gender and the accent information is exploited. The total classification error reported on DES is 42.8%. The same figure on SUSAS is 46.3%. The reported human errors have been 32.3% in DES and 42% in SUSAS. For comparison purposes, a random classification would yield an error of 80% in DES and 87.5% in SUSAS, respectively.</abstract><pub>IEEE</pub><tpages>5</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2219-5491
ispartof 2006 14th European Signal Processing Conference, 2006, p.1-5
issn 2219-5491
2219-5491
language eng
recordid cdi_ieee_primary_7071406
source IEEE Xplore All Conference Series
subjects Europe
Feature extraction
Signal processing algorithms
Speech
Speech processing
Stress
title Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T23%3A56%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Fast%20sequential%20floating%20forward%20selection%20applied%20to%20emotional%20speech%20features%20estimated%20on%20DES%20and%20SUSAS%20data%20collections&rft.btitle=2006%2014th%20European%20Signal%20Processing%20Conference&rft.au=Ververidis,%20Dimitrios&rft.date=2006-09&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.issn=2219-5491&rft.eissn=2219-5491&rft_id=info:doi/&rft_dat=%3Cieee_CHZPO%3E7071406%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i90t-ea0718975ad8413c753bdd9942f24902901c43c450cfcf770404838d2193c373%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=7071406&rfr_iscdi=true
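
Two of the figures in the description can be illustrated with a short Python sketch. The chance-level errors of 80% and 87.5% are what a uniform random guess would give over 5 and 8 equally likely classes; those class counts are inferred here from the stated errors and are not given in this record. The fusion step is shown as a simple majority vote over per-segment labels, which is an assumption made for illustration: the record does not say which fusion rule the authors used. The function names `chance_error` and `fuse_by_majority` are hypothetical.

```python
from collections import Counter


def chance_error(num_classes: int) -> float:
    """Expected error of a uniform random guess over equally likely classes."""
    if num_classes < 1:
        raise ValueError("num_classes must be positive")
    return 1.0 - 1.0 / num_classes


def fuse_by_majority(segment_labels):
    """Fuse per-segment decisions into one utterance-level decision.

    Plain majority vote (an illustrative assumption, not necessarily the
    paper's rule). Ties go to the label that appeared first.
    """
    if not segment_labels:
        raise ValueError("need at least one segment decision")
    return Counter(segment_labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Class counts below are inferred from the stated chance errors.
    print(f"5 classes: {chance_error(5):.1%}")
    print(f"8 classes: {chance_error(8):.1%}")
    # Hypothetical per-segment decisions for one long utterance.
    print(fuse_by_majority(["anger", "neutral", "anger"]))
```

A majority vote is the simplest way to see why combining short-segment decisions can lower the error on long phrases: an isolated wrong segment is outvoted by its neighbours.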