Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections
In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from Danish Emotional Speech (DES) and a subset of Speech Under Simulated and Actual Stress (SUSAS) data collections. The proposed novelties are i...
Main Authors: | Ververidis, Dimitrios ; Kotropoulos, Constantine |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Europe ; Feature extraction ; Signal processing algorithms ; Speech ; Speech processing ; Stress |
cited_by | |
---|---|
cites | |
container_end_page | 5 |
container_issue | |
container_start_page | 1 |
container_title | |
container_volume | |
creator | Ververidis, Dimitrios ; Kotropoulos, Constantine |
description | In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from the Danish Emotional Speech (DES) and a subset of the Speech Under Simulated and Actual Stress (SUSAS) data collections. The proposed novelties are: 1) speeding up sequential floating feature selection by up to 60%, 2) fusing decisions taken on short speech segments to derive a single decision for longer utterances, and 3) demonstrating that gender and accent information reduces the classification error. Indeed, the classification error is lowered by 1% to 11% when decisions are combined over long phrases, and by 2% to 11% when gender and accent information is exploited. The total classification error reported on DES is 42.8%; the same figure on SUSAS is 46.3%. The reported human error rates are 32.3% on DES and 42% on SUSAS. For comparison, random classification would yield an error of 80% on DES and 87.5% on SUSAS. |
format | conference_proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2219-5491 |
ispartof | 2006 14th European Signal Processing Conference, 2006, p.1-5 |
issn | 2219-5491 2219-5491 |
language | eng |
recordid | cdi_ieee_primary_7071406 |
source | IEEE Xplore All Conference Series |
subjects | Europe ; Feature extraction ; Signal processing algorithms ; Speech ; Speech processing ; Stress |
title | Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T23%3A56%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Fast%20sequential%20floating%20forward%20selection%20applied%20to%20emotional%20speech%20features%20estimated%20on%20DES%20and%20SUSAS%20data%20collections&rft.btitle=2006%2014th%20European%20Signal%20Processing%20Conference&rft.au=Ververidis,%20Dimitrios&rft.date=2006-09&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.issn=2219-5491&rft.eissn=2219-5491&rft_id=info:doi/&rft_dat=%3Cieee_CHZPO%3E7071406%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i90t-ea0718975ad8413c753bdd9942f24902901c43c450cfcf770404838d2193c373%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=7071406&rfr_iscdi=true |
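
Two figures in the description can be illustrated with a short Python sketch. It is not the authors' implementation. The function names `chance_error` and `fuse_decisions` are hypothetical, the class counts of 5 (DES) and 8 (SUSAS) are inferred from the quoted random-classification errors rather than stated in the record, and majority voting is only an assumed fusion rule, since the record does not say how segment decisions are combined.

```python
from collections import Counter


def chance_error(num_classes: int) -> float:
    """Error rate of uniform random guessing over equally likely classes."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return 1.0 - 1.0 / num_classes


def fuse_decisions(segment_labels):
    """Majority vote over per-segment labels (assumed fusion rule).

    Ties are broken in favour of the label that appears first.
    """
    if not segment_labels:
        raise ValueError("need at least one segment decision")
    counts = Counter(segment_labels)
    best = max(counts.values())
    for label in segment_labels:
        if counts[label] == best:
            return label


if __name__ == "__main__":
    # Class counts below are inferred from the 80% and 87.5% figures.
    print(f"DES chance error:   {chance_error(5):.1%}")
    print(f"SUSAS chance error: {chance_error(8):.1%}")
    # Hypothetical per-segment decisions for one long utterance.
    print(fuse_decisions(["anger", "neutral", "anger", "sadness"]))
```

With equally likely classes, random guessing errs with probability 1 - 1/K, which matches the quoted 80% for K = 5 and 87.5% for K = 8. The vote shows how several short-segment decisions could yield one label for a longer utterance.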