
Incorporating Model-Specific Score Distribution in Speaker Verification Systems

It has been shown that the authentication performance of a biometric system is dependent on the models/templates specific to a user. As a result, some users may be more easily recognized or impersonated than others. The various categories of users have been characterized by Doddington et al. (1988)...

Bibliographic Details
Published in:	IEEE transactions on audio, speech, and language processing, 2008-03, Vol.16 (3), p.594-606
Main Authors:	Poh, N.; Kittler, J.
Format:	Article
Language:	English
Subjects:	Animal behavior; Applied sciences; Authentication; Biomedical signal processing; Biometric authentication; Biometrics; Categories; Classifiers; Context modeling; Cryptography; Exact sciences and technology; Feeds; Information, signal and communications theory; Lamb; Natural language processing; Pattern recognition; Recognition; score normalization; Signal and communications theory; Signal processing; Signal representation. Spectral analysis; Signal, noise; Speech processing; Statistics; Telecommunications and information theory; Training data

cited_by	cdi_FETCH-LOGICAL-c384t-8148f8c5cdb7000d560089ff36e33a4d4398c7da08fbf873c6e68c2a176005c13
cites	cdi_FETCH-LOGICAL-c384t-8148f8c5cdb7000d560089ff36e33a4d4398c7da08fbf873c6e68c2a176005c13
container_end_page	606
container_issue	3
container_start_page	594
container_title	IEEE transactions on audio, speech, and language processing
container_volume	16
creator	Poh, N.; Kittler, J.
description	It has been shown that the authentication performance of a biometric system is dependent on the models/templates specific to a user. As a result, some users may be more easily recognized or impersonated than others. The various categories of users have been characterized by Doddington et al. (1988). We refer to this unbalanced performance across users as the Doddington's zoo effect. In the context of fusion, we argue that this effect is system-dependent, i.e., a user model that is easily impersonated (a lamb) in one system may be easily recognized in another system (a sheep). While in principle, a fusion system could be trained to cope with the changing animal behavior of users from system to system, the lack of training data makes it impossible. We believe that one major cause of the Doddington's zoo effect is the variation of class conditional scores from one speaker model to another. We propose a two-level fusion framework that effectively realizes a fusion classifier adapted to each user. First, one applies a client-specific (or model-specific) score normalization procedure to each of the system outputs to be combined. Then, one feeds the resulting normalized outputs to a fusion classifier (common to all users) as input to obtain a final combined score. Two existing model-specific score normalization procedures are considered in this framework, i.e., F- and Z-norms. In addition to them, a novel score normalization method called model-specific log-likelihood ratio (MS-LLR) is also proposed. While Z-norm is impostor-centric, i.e., it makes use of only the impostor score statistics, F-norm and the proposed MS-LLR are client-impostor centric, i.e., they consider both the client and impostor score statistics simultaneously. 
Our findings based on the XM2VTS and the NIST2005 databases show that when client-impostor centric normalization procedures are used to implement the proposed two-level fusion framework, the resulting fusion classifier outperforms the conventional fusion classifier (without applying any user-specific score normalization) in the majority of experiments.
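The two-level idea in the description above (normalize each system's score per claimed model, then pass the normalized scores to one fusion stage shared by all users) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses only the impostor-centric Z-norm in its commonly cited form (subtract the model-specific impostor mean, divide by the model-specific impostor standard deviation), and a plain average stands in for the trained fusion classifier. The function names `z_norm` and `fuse_two_level` and all sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def z_norm(score, impostor_scores):
    """Z-norm a raw score using impostor statistics of the claimed model.

    Assumption: the common textbook form (score - mu_I) / sigma_I, where
    mu_I and sigma_I come from impostor scores against this one model.
    """
    mu = mean(impostor_scores)
    sigma = pstdev(impostor_scores)
    if sigma == 0:
        raise ValueError("impostor scores have zero spread; cannot normalize")
    return (score - mu) / sigma


def fuse_two_level(raw_scores, impostor_scores_per_system):
    """Level 1: model-specific normalization of each system output.
    Level 2: a fusion rule common to all users.

    Assumption: a simple mean replaces the trained fusion classifier
    mentioned in the abstract; it is a placeholder only.
    """
    normalized = [
        z_norm(score, impostors)
        for score, impostors in zip(raw_scores, impostor_scores_per_system)
    ]
    return mean(normalized)


if __name__ == "__main__":
    # Hypothetical scores from two systems for one claimed identity.
    raw = [3.0, 10.0]
    impostors = [[0.0, 2.0], [4.0, 8.0]]
    print(fuse_two_level(raw, impostors))
```

Because the normalization statistics are looked up per claimed model while the fusion rule is shared, the combined decision is effectively adapted to each user without training a separate fusion classifier per user. F-norm and MS-LLR would replace `z_norm` with a function that also takes client score statistics; their exact formulas are not given in this record, so they are not sketched here.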
doi_str_mv	10.1109/TASL.2008.916525
format	article
fullrecord	<record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_miscellaneous_1671334391</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4451150</ieee_id><sourcerecordid>2568775391</sourcerecordid><originalsourceid>FETCH-LOGICAL-c384t-8148f8c5cdb7000d560089ff36e33a4d4398c7da08fbf873c6e68c2a176005c13</originalsourceid><addsrcrecordid>eNp9kc1LwzAYxoMoOKd3wUsRFC-dSfPR9Djm12CyQ6fXkKWJZHbNTNrD_ntTKzt4EAIJ7_N7Ht7wAHCJ4AQhWNyvpuVikkHIJwViNKNHYIQo5WleZOT48EbsFJyFsIGQYEbQCCznjXJ-57xsbfORvLpK12m508oaq5Iyajp5sKH1dt211jWJbZIoy0_tk3fte0r-zMt9aPU2nIMTI-ugL37vMXh7elzNXtLF8nk-my5ShTlpU44IN1xRVa1zCGFFWdy8MAYzjbEkFcEFV3klITdrw3OsmGZcZRLlEaQK4TG4HXJ33n11OrRia4PSdS0b7bogMMOYMMIiePcviFiOIoqLPvP6D7pxnW_iN0SBcgLjgRGCA6S8C8FrI3bebqXfCwRF34TomxB9E2JoIlpufnNlULI2XjbKhoOvRzHkffTVwFmt9UEmhCJEIf4GeSOQWQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>917407400</pqid></control><display><type>article</type><title>Incorporating Model-Specific Score Distribution in Speaker Verification Systems</title><source>IEEE Xplore (Online service)</source><creator>Poh, N. ; Kittler, J.</creator><creatorcontrib>Poh, N. ; Kittler, J.</creatorcontrib><description>It has been shown that the authentication performance of a biometric system is dependent on the models/templates specific to a user. As a result, some users may be more easily recognized or impersonated than others. The various categories of users have been characterized by Doddington et al . (1988). We refer to this unbalanced performance across users as the Doddington's zoo effect. In the context of fusion, we argue that this effect is system-dependent, i.e., a user model that is easily impersonated (a lamb) in one system may be easily recognized in another system (a sheep). 
While in principle, a fusion system could be trained to cope with the changing animal behavior of users from system to system, the lack of training data makes it impossible. We believe that one major cause of the Doddington's zoo effect is the variation of class conditional scores from one speaker model to another. We propose a two-level fusion framework that effectively realizes a fusion classifier adapted to each user. First, one applies a client-specific (or model-specific) score normalization procedure to each of the system outputs to be combined. Then, one feeds the resulting normalized outputs to a fusion classifier (common to all users) as input to obtain a final combined score. Two existing model-specific score normalization procedures are considered in this framework, i.e., F- and Z-norms. In addition to them, a novel score normalization method called model-specific log-likelihood ratio (MS-LLR) is also proposed. While Z-norm is impostor-centric, i.e., it makes use of only the impostor score statistics, F-norm and the proposed MS-LLR are client-impostor centric, i.e., they consider both the client and impostor score statistics simultaneously. 
Our findings based on the XM2VTS and the NIST2005 databases show that when client-impostor centric normalization procedures are used to implement the proposed two-level fusion framework, the resulting fusion classifier outperforms the conventional fusion classifier (without applying any user-specific score normalization) in the majority of experiments.</description><identifier>ISSN: 1558-7916</identifier><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 1558-7924</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASL.2008.916525</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>Piscataway, NJ: IEEE</publisher><subject>Animal behavior ; Applied sciences ; Authentication ; Biomedical signal processing ; Biometric authentication ; Biometrics ; Categories ; Classifiers ; Context modeling ; Cryptography ; Exact sciences and technology ; Feeds ; Information, signal and communications theory ; Lamb ; Natural language processing ; Pattern recognition ; Recognition ; score normalization ; Signal and communications theory ; Signal processing ; Signal representation. Spectral analysis ; Signal, noise ; Speech processing ; Statistics ; Telecommunications and information theory ; Training data</subject><ispartof>IEEE transactions on audio, speech, and language processing, 2008-03, Vol.16 (3), p.594-606</ispartof><rights>2008 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2008</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c384t-8148f8c5cdb7000d560089ff36e33a4d4398c7da08fbf873c6e68c2a176005c13</citedby><cites>FETCH-LOGICAL-c384t-8148f8c5cdb7000d560089ff36e33a4d4398c7da08fbf873c6e68c2a176005c13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4451150$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=20083080$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Poh, N.</creatorcontrib><creatorcontrib>Kittler, J.</creatorcontrib><title>Incorporating Model-Specific Score Distribution in Speaker Verification Systems</title><title>IEEE transactions on audio, speech, and language processing</title><addtitle>TASL</addtitle><description>It has been shown that the authentication performance of a biometric system is dependent on the models/templates specific to a user. As a result, some users may be more easily recognized or impersonated than others. The various categories of users have been characterized by Doddington et al . (1988). We refer to this unbalanced performance across users as the Doddington's zoo effect. In the context of fusion, we argue that this effect is system-dependent, i.e., a user model that is easily impersonated (a lamb) in one system may be easily recognized in another system (a sheep). While in principle, a fusion system could be trained to cope with the changing animal behavior of users from system to system, the lack of training data makes it impossible. We believe that one major cause of the Doddington's zoo effect is the variation of class conditional scores from one speaker model to another. 
We propose a two-level fusion framework that effectively realizes a fusion classifier adapted to each user. First, one applies a client-specific (or model-specific) score normalization procedure to each of the system outputs to be combined. Then, one feeds the resulting normalized outputs to a fusion classifier (common to all users) as input to obtain a final combined score. Two existing model-specific score normalization procedures are considered in this framework, i.e., F- and Z-norms. In addition to them, a novel score normalization method called model-specific log-likelihood ratio (MS-LLR) is also proposed. While Z-norm is impostor-centric, i.e., it makes use of only the impostor score statistics, F-norm and the proposed MS-LLR are client-impostor centric, i.e., they consider both the client and impostor score statistics simultaneously. Our findings based on the XM2VTS and the NIST2005 databases show that when client-impostor centric normalization procedures are used to implement the proposed two-level fusion framework, the resulting fusion classifier outperforms the conventional fusion classifier (without applying any user-specific score normalization) in the majority of experiments.</description><subject>Animal behavior</subject><subject>Applied sciences</subject><subject>Authentication</subject><subject>Biomedical signal processing</subject><subject>Biometric authentication</subject><subject>Biometrics</subject><subject>Categories</subject><subject>Classifiers</subject><subject>Context modeling</subject><subject>Cryptography</subject><subject>Exact sciences and technology</subject><subject>Feeds</subject><subject>Information, signal and communications theory</subject><subject>Lamb</subject><subject>Natural language processing</subject><subject>Pattern recognition</subject><subject>Recognition</subject><subject>score normalization</subject><subject>Signal and communications theory</subject><subject>Signal processing</subject><subject>Signal representation. 
Spectral analysis</subject><subject>Signal, noise</subject><subject>Speech processing</subject><subject>Statistics</subject><subject>Telecommunications and information theory</subject><subject>Training data</subject><issn>1558-7916</issn><issn>2329-9290</issn><issn>1558-7924</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><recordid>eNp9kc1LwzAYxoMoOKd3wUsRFC-dSfPR9Djm12CyQ6fXkKWJZHbNTNrD_ntTKzt4EAIJ7_N7Ht7wAHCJ4AQhWNyvpuVikkHIJwViNKNHYIQo5WleZOT48EbsFJyFsIGQYEbQCCznjXJ-57xsbfORvLpK12m508oaq5Iyajp5sKH1dt211jWJbZIoy0_tk3fte0r-zMt9aPU2nIMTI-ugL37vMXh7elzNXtLF8nk-my5ShTlpU44IN1xRVa1zCGFFWdy8MAYzjbEkFcEFV3klITdrw3OsmGZcZRLlEaQK4TG4HXJ33n11OrRia4PSdS0b7bogMMOYMMIiePcviFiOIoqLPvP6D7pxnW_iN0SBcgLjgRGCA6S8C8FrI3bebqXfCwRF34TomxB9E2JoIlpufnNlULI2XjbKhoOvRzHkffTVwFmt9UEmhCJEIf4GeSOQWQ</recordid><startdate>20080301</startdate><enddate>20080301</enddate><creator>Poh, N.</creator><creator>Kittler, J.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20080301</creationdate><title>Incorporating Model-Specific Score Distribution in Speaker Verification Systems</title><author>Poh, N. 
; Kittler, J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c384t-8148f8c5cdb7000d560089ff36e33a4d4398c7da08fbf873c6e68c2a176005c13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Animal behavior</topic><topic>Applied sciences</topic><topic>Authentication</topic><topic>Biomedical signal processing</topic><topic>Biometric authentication</topic><topic>Biometrics</topic><topic>Categories</topic><topic>Classifiers</topic><topic>Context modeling</topic><topic>Cryptography</topic><topic>Exact sciences and technology</topic><topic>Feeds</topic><topic>Information, signal and communications theory</topic><topic>Lamb</topic><topic>Natural language processing</topic><topic>Pattern recognition</topic><topic>Recognition</topic><topic>score normalization</topic><topic>Signal and communications theory</topic><topic>Signal processing</topic><topic>Signal representation. Spectral analysis</topic><topic>Signal, noise</topic><topic>Speech processing</topic><topic>Statistics</topic><topic>Telecommunications and information theory</topic><topic>Training data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Poh, N.</creatorcontrib><creatorcontrib>Kittler, J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore (Online service)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems 
Abstracts Professional</collection><jtitle>IEEE transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Poh, N.</au><au>Kittler, J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Incorporating Model-Specific Score Distribution in Speaker Verification Systems</atitle><jtitle>IEEE transactions on audio, speech, and language processing</jtitle><stitle>TASL</stitle><date>2008-03-01</date><risdate>2008</risdate><volume>16</volume><issue>3</issue><spage>594</spage><epage>606</epage><pages>594-606</pages><issn>1558-7916</issn><issn>2329-9290</issn><eissn>1558-7924</eissn><eissn>2329-9304</eissn><coden>ITASD8</coden><abstract>It has been shown that the authentication performance of a biometric system is dependent on the models/templates specific to a user. As a result, some users may be more easily recognized or impersonated than others. The various categories of users have been characterized by Doddington et al . (1988). We refer to this unbalanced performance across users as the Doddington's zoo effect. In the context of fusion, we argue that this effect is system-dependent, i.e., a user model that is easily impersonated (a lamb) in one system may be easily recognized in another system (a sheep). While in principle, a fusion system could be trained to cope with the changing animal behavior of users from system to system, the lack of training data makes it impossible. We believe that one major cause of the Doddington's zoo effect is the variation of class conditional scores from one speaker model to another. We propose a two-level fusion framework that effectively realizes a fusion classifier adapted to each user. First, one applies a client-specific (or model-specific) score normalization procedure to each of the system outputs to be combined. 
Then, one feeds the resulting normalized outputs to a fusion classifier (common to all users) as input to obtain a final combined score. Two existing model-specific score normalization procedures are considered in this framework, i.e., F- and Z-norms. In addition to them, a novel score normalization method called model-specific log-likelihood ratio (MS-LLR) is also proposed. While Z-norm is impostor-centric, i.e., it makes use of only the impostor score statistics, F-norm and the proposed MS-LLR are client-impostor centric, i.e., they consider both the client and impostor score statistics simultaneously. Our findings based on the XM2VTS and the NIST2005 databases show that when client-impostor centric normalization procedures are used to implement the proposed two-level fusion framework, the resulting fusion classifier outperforms the conventional fusion classifier (without applying any user-specific score normalization) in the majority of experiments.</abstract><cop>Piscataway, NJ</cop><pub>IEEE</pub><doi>10.1109/TASL.2008.916525</doi><tpages>13</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1558-7916
ispartof	IEEE transactions on audio, speech, and language processing, 2008-03, Vol.16 (3), p.594-606
issn	1558-7916 2329-9290 1558-7924 2329-9304
language	eng
recordid	cdi_proquest_miscellaneous_1671334391
source	IEEE Xplore (Online service)
subjects	Animal behavior; Applied sciences; Authentication; Biomedical signal processing; Biometric authentication; Biometrics; Categories; Classifiers; Context modeling; Cryptography; Exact sciences and technology; Feeds; Information, signal and communications theory; Lamb; Natural language processing; Pattern recognition; Recognition; score normalization; Signal and communications theory; Signal processing; Signal representation. Spectral analysis; Signal, noise; Speech processing; Statistics; Telecommunications and information theory; Training data
title	Incorporating Model-Specific Score Distribution in Speaker Verification Systems
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T12%3A26%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Incorporating%20Model-Specific%20Score%20Distribution%20in%20Speaker%20Verification%20Systems&rft.jtitle=IEEE%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Poh,%20N.&rft.date=2008-03-01&rft.volume=16&rft.issue=3&rft.spage=594&rft.epage=606&rft.pages=594-606&rft.issn=1558-7916&rft.eissn=1558-7924&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASL.2008.916525&rft_dat=%3Cproquest_ieee_%3E2568775391%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c384t-8148f8c5cdb7000d560089ff36e33a4d4398c7da08fbf873c6e68c2a176005c13%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=917407400&rft_id=info:pmid/&rft_ieee_id=4451150&rfr_iscdi=true