
Faceptor: A Generalist Model for Face Perception

With comprehensive research conducted on various face analysis tasks, there is growing interest among researchers in developing a unified approach to face perception. Existing methods mainly address unified representation and training, which lack task extensibility and application efficiency. To tackle this issue, we focus on the unified model structure, exploring a face generalist model. As an intuitive design, Naive Faceptor enables tasks with the same output shape and granularity to share the structural design of a standardized output head, achieving improved task extensibility. Furthermore, Faceptor adopts a well-designed single-encoder dual-decoder architecture, allowing task-specific queries to represent newly introduced semantics. This design enhances the unification of the model structure while improving application efficiency in terms of storage overhead. Additionally, we introduce Layer-Attention into Faceptor, enabling the model to adaptively select features from optimal layers to perform the desired tasks. Through joint training on 13 face perception datasets, Faceptor achieves exceptional performance in facial landmark localization, face parsing, age estimation, expression recognition, binary attribute classification, and face recognition, matching or surpassing specialized methods in most tasks. Our training framework can also be applied to auxiliary supervised learning, significantly improving performance in data-sparse tasks such as age estimation and expression recognition. The code and models will be made publicly available at https://github.com/lxq1000/Faceptor.
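The "Layer-Attention" idea mentioned in the abstract (letting each task adaptively weight features from different encoder layers) can be sketched roughly as a learned softmax over per-layer features. The function names, shapes, and initialization below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array
    e = np.exp(x - np.max(x))
    return e / e.sum()

def layer_attention(layer_feats, task_logits):
    """Fuse per-layer encoder features with task-specific softmax weights.

    layer_feats: (L, D) array, one feature vector per encoder layer
    task_logits: (L,) learnable logits for one task (hypothetical name)
    returns: (D,) task-adapted feature vector
    """
    w = softmax(task_logits)   # attention weights over the L layers
    return w @ layer_feats     # weighted sum of layer features

# toy example: 4 encoder layers, 8-dim features
feats = np.random.rand(4, 8)
logits = np.zeros(4)           # uniform attention at initialization
fused = layer_attention(feats, logits)
```

With zero logits the weights are uniform, so the fused feature is simply the mean over layers; during training, each task's logits would shift weight toward whichever layers suit it.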

Bibliographic Details
Published in: arXiv.org, 2024-03
Main Authors: Qin, Lixiong, Wang, Mei, Liu, Xuannan, Zhang, Yuhang, Deng, Wei, Song, Xiaoshuai, Xu, Weiran, Deng, Weihong
Format: Article
Language: English
Subjects: Chronology; Design standards; Extensibility; Face recognition; Machine learning; Perception; Semantics; Structural design; Supervised learning
EISSN: 2331-8422
Online Access: Get full text