
Towards Fine-grained Human Pose Transfer with Detail Replenishing Network

Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, exist...

Bibliographic Details
Published in:arXiv.org 2021-05
Main Authors: Yang, Lingbo, Wang, Pan, Liu, Chang, Gao, Zhanning, Ren, Peiran, Zhang, Xinfeng, Wang, Shanshe, Ma, Siwei, Hua, Xiansheng, Gao, Wen
Format: Article
Language:English
cited_by
cites
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Yang, Lingbo
Wang, Pan
Liu, Chang
Gao, Zhanning
Ren, Peiran
Zhang, Xinfeng
Wang, Shanshe
Ma, Siwei
Hua, Xiansheng
Gao, Wen
description Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images. Aiming towards real-world applications, we develop a more challenging yet practical HPT setting, termed as Fine-grained Human Pose Transfer (FHPT), with a higher focus on semantic fidelity and detail replenishment. Concretely, we analyze the potential design flaws of existing methods via an illustrative example, and establish the core FHPT methodology by combining the idea of content synthesis and feature transfer together in a mutually-guided fashion. Thereafter, we substantiate the proposed methodology with a Detail Replenishing Network (DRN) and a corresponding coarse-to-fine model training scheme. Moreover, we build up a complete suite of fine-grained evaluation protocols to address the challenges of FHPT in a comprehensive manner, including semantic analysis, structural detection and perceptual quality assessment. Extensive experiments on the DeepFashion benchmark dataset have verified the power of the proposed benchmark against state-of-the-art works, with 12%-14% gain on top-10 retrieval recall, 5% higher joint localization accuracy, and nearly 40% gain on face identity preservation. Moreover, the evaluation results offer further insights to the subject matter, which could inspire many promising future works along this direction.
doi_str_mv 10.48550/arxiv.2005.12494
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2407150021</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2407150021</sourcerecordid><originalsourceid>FETCH-LOGICAL-a521-67e4ee4a09cb48de1ca75cb84d911e6868fbeda4c2c40ae65808b1aaaa41ecd03</originalsourceid><addsrcrecordid>eNotj1FLwzAURoMgOOZ-gG8Bn1tv0qRNH2U6Nxgq0vdxm95umTWdSev8-SvoeTlv5-Nj7E5AqozW8IDh1_2kEkCnQqpSXbGZzDKRGCXlDVvEeAQAmRdS62zGNlV_xtBEvnKekn3ASQ1fj1_o-XsfiVcBfWwp8LMbDvyJBnQd_6BTR97Fg_N7_krDuQ-ft-y6xS7S4t9zVq2eq-U62b69bJaP2wS1FElekCJSCKWtlWlIWCy0rY1qSiEoN7lpa2pQWWkVIOXagKkFTihBtoFszu7_sqfQf48Uh92xH4OfFndSQSH0dE5kF7C0Tuo</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2407150021</pqid></control><display><type>article</type><title>Towards Fine-grained Human Pose Transfer with Detail Replenishing Network</title><source>Publicly Available Content Database</source><creator>Yang, Lingbo ; Wang, Pan ; Liu, Chang ; Gao, Zhanning ; Ren, Peiran ; Zhang, Xinfeng ; Wang, Shanshe ; Ma, Siwei ; Hua, Xiansheng ; Gao, Wen</creator><creatorcontrib>Yang, Lingbo ; Wang, Pan ; Liu, Chang ; Gao, Zhanning ; Ren, Peiran ; Zhang, Xinfeng ; Wang, Shanshe ; Ma, Siwei ; Hua, Xiansheng ; Gao, Wen</creatorcontrib><description>Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images. 
Aiming towards real-world applications, we develop a more challenging yet practical HPT setting, termed as Fine-grained Human Pose Transfer (FHPT), with a higher focus on semantic fidelity and detail replenishment. Concretely, we analyze the potential design flaws of existing methods via an illustrative example, and establish the core FHPT methodology by combining the idea of content synthesis and feature transfer together in a mutually-guided fashion. Thereafter, we substantiate the proposed methodology with a Detail Replenishing Network (DRN) and a corresponding coarse-to-fine model training scheme. Moreover, we build up a complete suite of fine-grained evaluation protocols to address the challenges of FHPT in a comprehensive manner, including semantic analysis, structural detection and perceptual quality assessment. Extensive experiments on the DeepFashion benchmark dataset have verified the power of the proposed benchmark against state-of-the-art works, with 12%-14% gain on top-10 retrieval recall, 5% higher joint localization accuracy, and nearly 40% gain on face identity preservation. Moreover, the evaluation results offer further insights to the subject matter, which could inspire many promising future works along this direction.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2005.12494</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Art works ; Benchmarks ; Evaluation ; Flaw detection ; Image quality ; Protocol (computers) ; Quality assessment ; Realism ; Replenishment ; Semantics ; Virtual reality</subject><ispartof>arXiv.org, 2021-05</ispartof><rights>2021. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2407150021?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,27925,37012,44590</link.rule.ids></links><search><creatorcontrib>Yang, Lingbo</creatorcontrib><creatorcontrib>Wang, Pan</creatorcontrib><creatorcontrib>Liu, Chang</creatorcontrib><creatorcontrib>Gao, Zhanning</creatorcontrib><creatorcontrib>Ren, Peiran</creatorcontrib><creatorcontrib>Zhang, Xinfeng</creatorcontrib><creatorcontrib>Wang, Shanshe</creatorcontrib><creatorcontrib>Ma, Siwei</creatorcontrib><creatorcontrib>Hua, Xiansheng</creatorcontrib><creatorcontrib>Gao, Wen</creatorcontrib><title>Towards Fine-grained Human Pose Transfer with Detail Replenishing Network</title><title>arXiv.org</title><description>Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images. Aiming towards real-world applications, we develop a more challenging yet practical HPT setting, termed as Fine-grained Human Pose Transfer (FHPT), with a higher focus on semantic fidelity and detail replenishment. 
Concretely, we analyze the potential design flaws of existing methods via an illustrative example, and establish the core FHPT methodology by combining the idea of content synthesis and feature transfer together in a mutually-guided fashion. Thereafter, we substantiate the proposed methodology with a Detail Replenishing Network (DRN) and a corresponding coarse-to-fine model training scheme. Moreover, we build up a complete suite of fine-grained evaluation protocols to address the challenges of FHPT in a comprehensive manner, including semantic analysis, structural detection and perceptual quality assessment. Extensive experiments on the DeepFashion benchmark dataset have verified the power of the proposed benchmark against state-of-the-art works, with 12%-14% gain on top-10 retrieval recall, 5% higher joint localization accuracy, and nearly 40% gain on face identity preservation. Moreover, the evaluation results offer further insights to the subject matter, which could inspire many promising future works along this direction.</description><subject>Art works</subject><subject>Benchmarks</subject><subject>Evaluation</subject><subject>Flaw detection</subject><subject>Image quality</subject><subject>Protocol (computers)</subject><subject>Quality assessment</subject><subject>Realism</subject><subject>Replenishment</subject><subject>Semantics</subject><subject>Virtual reality</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNotj1FLwzAURoMgOOZ-gG8Bn1tv0qRNH2U6Nxgq0vdxm95umTWdSev8-SvoeTlv5-Nj7E5AqozW8IDh1_2kEkCnQqpSXbGZzDKRGCXlDVvEeAQAmRdS62zGNlV_xtBEvnKekn3ASQ1fj1_o-XsfiVcBfWwp8LMbDvyJBnQd_6BTR97Fg_N7_krDuQ-ft-y6xS7S4t9zVq2eq-U62b69bJaP2wS1FElekCJSCKWtlWlIWCy0rY1qSiEoN7lpa2pQWWkVIOXagKkFTihBtoFszu7_sqfQf48Uh92xH4OfFndSQSH0dE5kF7C0Tuo</recordid><startdate>20210507</startdate><enddate>20210507</enddate><creator>Yang, 
Lingbo</creator><creator>Wang, Pan</creator><creator>Liu, Chang</creator><creator>Gao, Zhanning</creator><creator>Ren, Peiran</creator><creator>Zhang, Xinfeng</creator><creator>Wang, Shanshe</creator><creator>Ma, Siwei</creator><creator>Hua, Xiansheng</creator><creator>Gao, Wen</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20210507</creationdate><title>Towards Fine-grained Human Pose Transfer with Detail Replenishing Network</title><author>Yang, Lingbo ; Wang, Pan ; Liu, Chang ; Gao, Zhanning ; Ren, Peiran ; Zhang, Xinfeng ; Wang, Shanshe ; Ma, Siwei ; Hua, Xiansheng ; Gao, Wen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a521-67e4ee4a09cb48de1ca75cb84d911e6868fbeda4c2c40ae65808b1aaaa41ecd03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Art works</topic><topic>Benchmarks</topic><topic>Evaluation</topic><topic>Flaw detection</topic><topic>Image quality</topic><topic>Protocol (computers)</topic><topic>Quality assessment</topic><topic>Realism</topic><topic>Replenishment</topic><topic>Semantics</topic><topic>Virtual reality</topic><toplevel>online_resources</toplevel><creatorcontrib>Yang, Lingbo</creatorcontrib><creatorcontrib>Wang, Pan</creatorcontrib><creatorcontrib>Liu, Chang</creatorcontrib><creatorcontrib>Gao, Zhanning</creatorcontrib><creatorcontrib>Ren, Peiran</creatorcontrib><creatorcontrib>Zhang, Xinfeng</creatorcontrib><creatorcontrib>Wang, Shanshe</creatorcontrib><creatorcontrib>Ma, Siwei</creatorcontrib><creatorcontrib>Hua, 
Xiansheng</creatorcontrib><creatorcontrib>Gao, Wen</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yang, Lingbo</au><au>Wang, Pan</au><au>Liu, Chang</au><au>Gao, Zhanning</au><au>Ren, Peiran</au><au>Zhang, Xinfeng</au><au>Wang, Shanshe</au><au>Ma, Siwei</au><au>Hua, Xiansheng</au><au>Gao, Wen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards Fine-grained Human Pose Transfer with Detail Replenishing Network</atitle><jtitle>arXiv.org</jtitle><date>2021-05-07</date><risdate>2021</risdate><eissn>2331-8422</eissn><abstract>Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. 
However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images. Aiming towards real-world applications, we develop a more challenging yet practical HPT setting, termed as Fine-grained Human Pose Transfer (FHPT), with a higher focus on semantic fidelity and detail replenishment. Concretely, we analyze the potential design flaws of existing methods via an illustrative example, and establish the core FHPT methodology by combining the idea of content synthesis and feature transfer together in a mutually-guided fashion. Thereafter, we substantiate the proposed methodology with a Detail Replenishing Network (DRN) and a corresponding coarse-to-fine model training scheme. Moreover, we build up a complete suite of fine-grained evaluation protocols to address the challenges of FHPT in a comprehensive manner, including semantic analysis, structural detection and perceptual quality assessment. Extensive experiments on the DeepFashion benchmark dataset have verified the power of the proposed benchmark against state-of-the-art works, with 12%-14% gain on top-10 retrieval recall, 5% higher joint localization accuracy, and nearly 40% gain on face identity preservation. Moreover, the evaluation results offer further insights to the subject matter, which could inspire many promising future works along this direction.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2005.12494</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2021-05
issn 2331-8422
language eng
recordid cdi_proquest_journals_2407150021
source Publicly Available Content Database
subjects Art works
Benchmarks
Evaluation
Flaw detection
Image quality
Protocol (computers)
Quality assessment
Realism
Replenishment
Semantics
Virtual reality
title Towards Fine-grained Human Pose Transfer with Detail Replenishing Network
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T09%3A02%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20Fine-grained%20Human%20Pose%20Transfer%20with%20Detail%20Replenishing%20Network&rft.jtitle=arXiv.org&rft.au=Yang,%20Lingbo&rft.date=2021-05-07&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2005.12494&rft_dat=%3Cproquest%3E2407150021%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a521-67e4ee4a09cb48de1ca75cb84d911e6868fbeda4c2c40ae65808b1aaaa41ecd03%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2407150021&rft_id=info:pmid/&rfr_iscdi=true