
Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications


Bibliographic Details
Main Authors: Metze, F., Ajmera, J., Englert, R., Bub, U., Burkhardt, F., Stegmann, J., Muller, C., Huber, R., Andrassy, B., Bauer, J. G., Littel, B.
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page IV-1092
container_issue
container_start_page IV-1089
container_title
container_volume 4
creator Metze, F.
Ajmera, J.
Englert, R.
Bub, U.
Burkhardt, F.
Stegmann, J.
Muller, C.
Huber, R.
Andrassy, B.
Bauer, J. G.
Littel, B.
description This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. The system based on prosodic features, however, shows very little dependence on the length of the utterance.
doi_str_mv 10.1109/ICASSP.2007.367263
format conference_proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4218294</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4218294</ieee_id><sourcerecordid>4218294</sourcerecordid><originalsourceid>FETCH-LOGICAL-i219t-bde79a77dbda9fa18c963b8682640ec27d3d1c45ef477d9bf248d751af6151873</originalsourceid><addsrcrecordid>eNpVj8tOAjEYhestEZEX0E1fYLB_2-llSYigCYlG0LgjnelfqIHppIML3t4hunF1Ft_5TnIIuQM2BmD24Xk6WS5fx5wxPRZKcyXOyMhqA5JLyTQ36pwMuNC2AMs-L_4xbS_JAErOCgXSXpObrvtijBktzYB8TNO-dTl2qaEp0Fn6znTStjm5eosdPSQ62SB1jadzbDxm-oZ12jTxEHshpExXuMN2mxo8abtYuxPpbslVcLsOR385JO-zx9X0qVi8zPsviyJysIei8qit09pX3tngwNRWicoow5VkWHPthYdalhhkX7JV4NJ4XYILCkowWgzJ_e9uRMR1m-Pe5eNacjDcSvEDxjBWvg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Metze, F. ; Ajmera, J. ; Englert, R. ; Bub, U. ; Burkhardt, F. ; Stegmann, J. ; Muller, C. ; Huber, R. ; Andrassy, B. ; Bauer, J. G. ; Littel, B.</creator><creatorcontrib>Metze, F. ; Ajmera, J. ; Englert, R. ; Bub, U. ; Burkhardt, F. ; Stegmann, J. ; Muller, C. ; Huber, R. ; Andrassy, B. ; Bauer, J. G. ; Littel, B.</creatorcontrib><description>This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. 
The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. The system based on prosodic features however shows very little dependence on the length of the utterance.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9781424407279</identifier><identifier>ISBN: 1424407273</identifier><identifier>EISSN: 2379-190X</identifier><identifier>EISBN: 9781424407286</identifier><identifier>EISBN: 1424407281</identifier><identifier>DOI: 10.1109/ICASSP.2007.367263</identifier><language>eng</language><publisher>IEEE</publisher><subject>acoustic signal analysis ; age ; Application specific integrated circuits ; Automatic speech recognition ; Bayesian methods ; gender ; Hidden Markov models ; Humans ; Laboratories ; Linear discriminant analysis ; speaker classification ; speech processing ; Telephony ; Testing</subject><ispartof>2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007, Vol.4, p.IV-1089-IV-1092</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4218294$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4218294$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Metze, 
F.</creatorcontrib><creatorcontrib>Ajmera, J.</creatorcontrib><creatorcontrib>Englert, R.</creatorcontrib><creatorcontrib>Bub, U.</creatorcontrib><creatorcontrib>Burkhardt, F.</creatorcontrib><creatorcontrib>Stegmann, J.</creatorcontrib><creatorcontrib>Muller, C.</creatorcontrib><creatorcontrib>Huber, R.</creatorcontrib><creatorcontrib>Andrassy, B.</creatorcontrib><creatorcontrib>Bauer, J. G.</creatorcontrib><creatorcontrib>Littel, B.</creatorcontrib><title>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</title><title>2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07</title><addtitle>ICASSP</addtitle><description>This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. 
The system based on prosodic features however shows very little dependence on the length of the utterance.</description><subject>acoustic signal analysis</subject><subject>age</subject><subject>Application specific integrated circuits</subject><subject>Automatic speech recognition</subject><subject>Bayesian methods</subject><subject>gender</subject><subject>Hidden Markov models</subject><subject>Humans</subject><subject>Laboratories</subject><subject>Linear discriminant analysis</subject><subject>speaker classification</subject><subject>speech processing</subject><subject>Telephony</subject><subject>Testing</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9781424407279</isbn><isbn>1424407273</isbn><isbn>9781424407286</isbn><isbn>1424407281</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2007</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpVj8tOAjEYhestEZEX0E1fYLB_2-llSYigCYlG0LgjnelfqIHppIML3t4hunF1Ft_5TnIIuQM2BmD24Xk6WS5fx5wxPRZKcyXOyMhqA5JLyTQ36pwMuNC2AMs-L_4xbS_JAErOCgXSXpObrvtijBktzYB8TNO-dTl2qaEp0Fn6znTStjm5eosdPSQ62SB1jadzbDxm-oZ12jTxEHshpExXuMN2mxo8abtYuxPpbslVcLsOR385JO-zx9X0qVi8zPsviyJysIei8qit09pX3tngwNRWicoow5VkWHPthYdalhhkX7JV4NJ4XYILCkowWgzJ_e9uRMR1m-Pe5eNacjDcSvEDxjBWvg</recordid><startdate>20070101</startdate><enddate>20070101</enddate><creator>Metze, F.</creator><creator>Ajmera, J.</creator><creator>Englert, R.</creator><creator>Bub, U.</creator><creator>Burkhardt, F.</creator><creator>Stegmann, J.</creator><creator>Muller, C.</creator><creator>Huber, R.</creator><creator>Andrassy, B.</creator><creator>Bauer, J. G.</creator><creator>Littel, B.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20070101</creationdate><title>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</title><author>Metze, F. ; Ajmera, J. 
; Englert, R. ; Bub, U. ; Burkhardt, F. ; Stegmann, J. ; Muller, C. ; Huber, R. ; Andrassy, B. ; Bauer, J. G. ; Littel, B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i219t-bde79a77dbda9fa18c963b8682640ec27d3d1c45ef477d9bf248d751af6151873</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2007</creationdate><topic>acoustic signal analysis</topic><topic>age</topic><topic>Application specific integrated circuits</topic><topic>Automatic speech recognition</topic><topic>Bayesian methods</topic><topic>gender</topic><topic>Hidden Markov models</topic><topic>Humans</topic><topic>Laboratories</topic><topic>Linear discriminant analysis</topic><topic>speaker classification</topic><topic>speech processing</topic><topic>Telephony</topic><topic>Testing</topic><toplevel>online_resources</toplevel><creatorcontrib>Metze, F.</creatorcontrib><creatorcontrib>Ajmera, J.</creatorcontrib><creatorcontrib>Englert, R.</creatorcontrib><creatorcontrib>Bub, U.</creatorcontrib><creatorcontrib>Burkhardt, F.</creatorcontrib><creatorcontrib>Stegmann, J.</creatorcontrib><creatorcontrib>Muller, C.</creatorcontrib><creatorcontrib>Huber, R.</creatorcontrib><creatorcontrib>Andrassy, B.</creatorcontrib><creatorcontrib>Bauer, J. 
G.</creatorcontrib><creatorcontrib>Littel, B.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Metze, F.</au><au>Ajmera, J.</au><au>Englert, R.</au><au>Bub, U.</au><au>Burkhardt, F.</au><au>Stegmann, J.</au><au>Muller, C.</au><au>Huber, R.</au><au>Andrassy, B.</au><au>Bauer, J. G.</au><au>Littel, B.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications</atitle><btitle>2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07</btitle><stitle>ICASSP</stitle><date>2007-01-01</date><risdate>2007</risdate><volume>4</volume><spage>IV-1089</spage><epage>IV-1092</epage><pages>IV-1089-IV-1092</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9781424407279</isbn><isbn>1424407273</isbn><eisbn>9781424407286</eisbn><eisbn>1424407281</eisbn><abstract>This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. 
On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. The system based on prosodic features however shows very little dependence on the length of the utterance.</abstract><pub>IEEE</pub><doi>10.1109/ICASSP.2007.367263</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007, Vol.4, p.IV-1089-IV-1092
issn 1520-6149
2379-190X
language eng
recordid cdi_ieee_primary_4218294
source IEEE Electronic Library (IEL) Conference Proceedings
subjects acoustic signal analysis
age
Application specific integrated circuits
Automatic speech recognition
Bayesian methods
gender
Hidden Markov models
Humans
Laboratories
Linear discriminant analysis
speaker classification
speech processing
Telephony
Testing
title Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T00%3A54%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Comparison%20of%20Four%20Approaches%20to%20Age%20and%20Gender%20Recognition%20for%20Telephone%20Applications&rft.btitle=2007%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing%20-%20ICASSP%20'07&rft.au=Metze,%20F.&rft.date=2007-01-01&rft.volume=4&rft.spage=IV-1089&rft.epage=IV-1092&rft.pages=IV-1089-IV-1092&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9781424407279&rft.isbn_list=1424407273&rft_id=info:doi/10.1109/ICASSP.2007.367263&rft.eisbn=9781424407286&rft.eisbn_list=1424407281&rft_dat=%3Cieee_6IE%3E4218294%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i219t-bde79a77dbda9fa18c963b8682640ec27d3d1c45ef477d9bf248d751af6151873%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4218294&rfr_iscdi=true