Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications
This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone re...
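Main Authors: | Metze, F.; Ajmera, J.; Englert, R.; Bub, U.; Burkhardt, F.; Stegmann, J.; Muller, C.; Huber, R.; Andrassy, B.; Bauer, J. G.; Littel, B. |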
---|---|
Format: | Conference Proceeding |
Language: | English |
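Subjects: | acoustic signal analysis; age; Application specific integrated circuits; Automatic speech recognition; Bayesian methods; gender; Hidden Markov models; Humans; Laboratories; Linear discriminant analysis; speaker classification; speech processing; Telephony; Testing |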
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | IV-1092 |
container_issue | |
container_start_page | IV-1089 |
container_title | |
container_volume | 4 |
creator | Metze, F.; Ajmera, J.; Englert, R.; Bub, U.; Burkhardt, F.; Stegmann, J.; Muller, C.; Huber, R.; Andrassy, B.; Bauer, J. G.; Littel, B. |
description | This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. The system based on prosodic features however shows very little dependence on the length of the utterance. |
doi_str_mv | 10.1109/ICASSP.2007.367263 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4218294</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4218294</ieee_id><sourcerecordid>4218294</sourcerecordid><originalsourceid>FETCH-LOGICAL-i219t-bde79a77dbda9fa18c963b8682640ec27d3d1c45ef477d9bf248d751af6151873</originalsourceid><addsrcrecordid>eNpVj8tOAjEYhestEZEX0E1fYLB_2-llSYigCYlG0LgjnelfqIHppIML3t4hunF1Ft_5TnIIuQM2BmD24Xk6WS5fx5wxPRZKcyXOyMhqA5JLyTQ36pwMuNC2AMs-L_4xbS_JAErOCgXSXpObrvtijBktzYB8TNO-dTl2qaEp0Fn6znTStjm5eosdPSQ62SB1jadzbDxm-oZ12jTxEHshpExXuMN2mxo8abtYuxPpbslVcLsOR385JO-zx9X0qVi8zPsviyJysIei8qit09pX3tngwNRWicoow5VkWHPthYdalhhkX7JV4NJ4XYILCkowWgzJ_e9uRMR1m-Pe5eNacjDcSvEDxjBWvg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Metze, F. ; Ajmera, J. ; Englert, R. ; Bub, U. ; Burkhardt, F. ; Stegmann, J. ; Muller, C. ; Huber, R. ; Andrassy, B. ; Bauer, J. G. ; Littel, B.</creator><creatorcontrib>Metze, F. ; Ajmera, J. ; Englert, R. ; Bub, U. ; Burkhardt, F. ; Stegmann, J. ; Muller, C. ; Huber, R. ; Andrassy, B. ; Bauer, J. G. ; Littel, B.</creatorcontrib><description>This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. 
The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. The system based on prosodic features however shows very little dependence on the length of the utterance.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9781424407279</identifier><identifier>ISBN: 1424407273</identifier><identifier>EISSN: 2379-190X</identifier><identifier>EISBN: 9781424407286</identifier><identifier>EISBN: 1424407281</identifier><identifier>DOI: 10.1109/ICASSP.2007.367263</identifier><language>eng</language><publisher>IEEE</publisher><subject>acoustic signal analysis ; age ; Application specific integrated circuits ; Automatic speech recognition ; Bayesian methods ; gender ; Hidden Markov models ; Humans ; Laboratories ; Linear discriminant analysis ; speaker classification ; speech processing ; Telephony ; Testing</subject><ispartof>2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007, Vol.4, p.IV-1089-IV-1092</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4218294$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4218294$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Metze, 
F.</creatorcontrib><creatorcontrib>Ajmera, J.</creatorcontrib><creatorcontrib>Englert, R.</creatorcontrib><creatorcontrib>Bub, U.</creatorcontrib><creatorcontrib>Burkhardt, F.</creatorcontrib><creatorcontrib>Stegmann, J.</creatorcontrib><creatorcontrib>Muller, C.</creatorcontrib><creatorcontrib>Huber, R.</creatorcontrib><creatorcontrib>Andrassy, B.</creatorcontrib><creatorcontrib>Bauer, J. G.</creatorcontrib><creatorcontrib>Littel, B.</creatorcontrib><title>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</title><title>2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07</title><addtitle>ICASSP</addtitle><description>This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. 
The system based on prosodic features however shows very little dependence on the length of the utterance.</description><subject>acoustic signal analysis</subject><subject>age</subject><subject>Application specific integrated circuits</subject><subject>Automatic speech recognition</subject><subject>Bayesian methods</subject><subject>gender</subject><subject>Hidden Markov models</subject><subject>Humans</subject><subject>Laboratories</subject><subject>Linear discriminant analysis</subject><subject>speaker classification</subject><subject>speech processing</subject><subject>Telephony</subject><subject>Testing</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9781424407279</isbn><isbn>1424407273</isbn><isbn>9781424407286</isbn><isbn>1424407281</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2007</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpVj8tOAjEYhestEZEX0E1fYLB_2-llSYigCYlG0LgjnelfqIHppIML3t4hunF1Ft_5TnIIuQM2BmD24Xk6WS5fx5wxPRZKcyXOyMhqA5JLyTQ36pwMuNC2AMs-L_4xbS_JAErOCgXSXpObrvtijBktzYB8TNO-dTl2qaEp0Fn6znTStjm5eosdPSQ62SB1jadzbDxm-oZ12jTxEHshpExXuMN2mxo8abtYuxPpbslVcLsOR385JO-zx9X0qVi8zPsviyJysIei8qit09pX3tngwNRWicoow5VkWHPthYdalhhkX7JV4NJ4XYILCkowWgzJ_e9uRMR1m-Pe5eNacjDcSvEDxjBWvg</recordid><startdate>20070101</startdate><enddate>20070101</enddate><creator>Metze, F.</creator><creator>Ajmera, J.</creator><creator>Englert, R.</creator><creator>Bub, U.</creator><creator>Burkhardt, F.</creator><creator>Stegmann, J.</creator><creator>Muller, C.</creator><creator>Huber, R.</creator><creator>Andrassy, B.</creator><creator>Bauer, J. G.</creator><creator>Littel, B.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20070101</creationdate><title>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</title><author>Metze, F. ; Ajmera, J. 
; Englert, R. ; Bub, U. ; Burkhardt, F. ; Stegmann, J. ; Muller, C. ; Huber, R. ; Andrassy, B. ; Bauer, J. G. ; Littel, B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i219t-bde79a77dbda9fa18c963b8682640ec27d3d1c45ef477d9bf248d751af6151873</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2007</creationdate><topic>acoustic signal analysis</topic><topic>age</topic><topic>Application specific integrated circuits</topic><topic>Automatic speech recognition</topic><topic>Bayesian methods</topic><topic>gender</topic><topic>Hidden Markov models</topic><topic>Humans</topic><topic>Laboratories</topic><topic>Linear discriminant analysis</topic><topic>speaker classification</topic><topic>speech processing</topic><topic>Telephony</topic><topic>Testing</topic><toplevel>online_resources</toplevel><creatorcontrib>Metze, F.</creatorcontrib><creatorcontrib>Ajmera, J.</creatorcontrib><creatorcontrib>Englert, R.</creatorcontrib><creatorcontrib>Bub, U.</creatorcontrib><creatorcontrib>Burkhardt, F.</creatorcontrib><creatorcontrib>Stegmann, J.</creatorcontrib><creatorcontrib>Muller, C.</creatorcontrib><creatorcontrib>Huber, R.</creatorcontrib><creatorcontrib>Andrassy, B.</creatorcontrib><creatorcontrib>Bauer, J. 
G.</creatorcontrib><creatorcontrib>Littel, B.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Metze, F.</au><au>Ajmera, J.</au><au>Englert, R.</au><au>Bub, U.</au><au>Burkhardt, F.</au><au>Stegmann, J.</au><au>Muller, C.</au><au>Huber, R.</au><au>Andrassy, B.</au><au>Bauer, J. G.</au><au>Littel, B.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</atitle><btitle>2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07</btitle><stitle>ICASSP</stitle><date>2007-01-01</date><risdate>2007</risdate><volume>4</volume><spage>IV-1089</spage><epage>IV-1092</epage><pages>IV-1089-IV-1092</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9781424407279</isbn><isbn>1424407273</isbn><eisbn>9781424407286</eisbn><eisbn>1424407281</eisbn><abstract>This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. 
On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. The system based on prosodic features however shows very little dependence on the length of the utterance.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2007.367263</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-6149 |
ispartof | 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007, Vol.4, p.IV-1089-IV-1092 |
issn | 1520-6149 2379-190X |
language | eng |
recordid | cdi_ieee_primary_4218294 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | acoustic signal analysis; age; Application specific integrated circuits; Automatic speech recognition; Bayesian methods; gender; Hidden Markov models; Humans; Laboratories; Linear discriminant analysis; speaker classification; speech processing; Telephony; Testing |
title | Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T00%3A54%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Comparison%20of%20Four%20Approaches%20to%20Age%20and%20Gender%20Recognition%20for%20Telephone%20Applications&rft.btitle=2007%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing%20-%20ICASSP%20'07&rft.au=Metze,%20F.&rft.date=2007-01-01&rft.volume=4&rft.spage=IV-1089&rft.epage=IV-1092&rft.pages=IV-1089-IV-1092&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9781424407279&rft.isbn_list=1424407273&rft_id=info:doi/10.1109/ICASSP.2007.367263&rft.eisbn=9781424407286&rft.eisbn_list=1424407281&rft_dat=%3Cieee_6IE%3E4218294%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i219t-bde79a77dbda9fa18c963b8682640ec27d3d1c45ef477d9bf248d751af6151873%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4218294&rfr_iscdi=true |