Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis

Recent works in implicit representations, such as Neural Radiance Fields (NeRF), have advanced the generation of realistic and animatable head avatars from video sequences. These implicit methods are still confronted by visual artifacts and jitters, since the lack of explicit geometric constraints p...

Bibliographic Details
Published in:	arXiv.org 2024-02
Main Authors:	Zhang, Zicheng ; Zheng, Ruobing ; Liu, Ziwen ; Han, Congying ; Li, Tianqi ; Wang, Meng ; Guo, Tiande ; Chen, Jingdong ; Li, Bonan ; Yang, Ming
Format:	Article
Language:	English
Subjects:	Avatars ; Deformation ; Geometric constraints ; Implicit methods ; Learning ; Neural networks ; Synchronism ; Synthesis ; Tetrahedra ; Texture ; Three dimensional models ; Time synchronization ; Topology ; Training

container_title	arXiv.org
creator	Zhang, Zicheng ; Zheng, Ruobing ; Liu, Ziwen ; Han, Congying ; Li, Tianqi ; Wang, Meng ; Guo, Tiande ; Chen, Jingdong ; Li, Bonan ; Yang, Ming
description	Recent works in implicit representations, such as Neural Radiance Fields (NeRF), have advanced the generation of realistic and animatable head avatars from video sequences. These implicit methods are still confronted by visual artifacts and jitters, since the lack of explicit geometric constraints poses a fundamental challenge in accurately modeling complex facial deformations. In this paper, we introduce Dynamic Tetrahedra (DynTet), a novel hybrid representation that encodes explicit dynamic meshes by neural networks to ensure geometric consistency across various motions and viewpoints. DynTet is parameterized by the coordinate-based networks which learn signed distance, deformation, and material texture, anchoring the training data into a predefined tetrahedra grid. Leveraging Marching Tetrahedra, DynTet efficiently decodes textured meshes with a consistent topology, enabling fast rendering through a differentiable rasterizer and supervision via a pixel loss. To enhance training efficiency, we incorporate classical 3D Morphable Models to facilitate geometry learning and define a canonical space for simplifying texture learning. These advantages are readily achievable owing to the effective geometric representation employed in DynTet. Compared with prior works, DynTet demonstrates significant improvements in fidelity, lip synchronization, and real-time performance according to various metrics. Beyond producing stable and visually appealing synthesis videos, our method also outputs the dynamic meshes which is promising to enable many emerging applications.
format	article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2932621137</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2932621137</sourcerecordid><originalsourceid>FETCH-proquest_journals_29326211373</originalsourceid><addsrcrecordid>eNqNyrEOgjAUQNHGxESi_EMTZxJ4FdAZNQwuRnbyAg9arEXbMvD3auIHON3hngULQIgk2u8AVix0bojjGLIc0lQErLgQWqNMz4-zwYdqeEXeoqTWIu9Gy0vVy-g6oVZ-5hXq-9eWhC2_zcZLcspt2LJD7Sj8dc2251NVlNHTjq-JnK-HcbLms2o4CMggSUQu_lNvzfM5zA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2932621137</pqid></control><display><type>article</type><title>Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis</title><source>Publicly Available Content (ProQuest)</source><creator>Zhang, Zicheng ; Zheng, Ruobing ; Liu, Ziwen ; Han, Congying ; Li, Tianqi ; Wang, Meng ; Guo, Tiande ; Chen, Jingdong ; Li, Bonan ; Yang, Ming</creator><creatorcontrib>Zhang, Zicheng ; Zheng, Ruobing ; Liu, Ziwen ; Han, Congying ; Li, Tianqi ; Wang, Meng ; Guo, Tiande ; Chen, Jingdong ; Li, Bonan ; Yang, Ming</creatorcontrib><description>Recent works in implicit representations, such as Neural Radiance Fields (NeRF), have advanced the generation of realistic and animatable head avatars from video sequences. These implicit methods are still confronted by visual artifacts and jitters, since the lack of explicit geometric constraints poses a fundamental challenge in accurately modeling complex facial deformations. In this paper, we introduce Dynamic Tetrahedra (DynTet), a novel hybrid representation that encodes explicit dynamic meshes by neural networks to ensure geometric consistency across various motions and viewpoints. DynTet is parameterized by the coordinate-based networks which learn signed distance, deformation, and material texture, anchoring the training data into a predefined tetrahedra grid. 
Leveraging Marching Tetrahedra, DynTet efficiently decodes textured meshes with a consistent topology, enabling fast rendering through a differentiable rasterizer and supervision via a pixel loss. To enhance training efficiency, we incorporate classical 3D Morphable Models to facilitate geometry learning and define a canonical space for simplifying texture learning. These advantages are readily achievable owing to the effective geometric representation employed in DynTet. Compared with prior works, DynTet demonstrates significant improvements in fidelity, lip synchronization, and real-time performance according to various metrics. Beyond producing stable and visually appealing synthesis videos, our method also outputs the dynamic meshes which is promising to enable many emerging applications.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Avatars ; Deformation ; Geometric constraints ; Implicit methods ; Learning ; Neural networks ; Synchronism ; Synthesis ; Tetrahedra ; Texture ; Three dimensional models ; Time synchronization ; Topology ; Training</subject><ispartof>arXiv.org, 2024-02</ispartof><rights>2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2932621137?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>776,780,25730,36988,44565</link.rule.ids></links><search><creatorcontrib>Zhang, Zicheng</creatorcontrib><creatorcontrib>Zheng, Ruobing</creatorcontrib><creatorcontrib>Liu, Ziwen</creatorcontrib><creatorcontrib>Han, Congying</creatorcontrib><creatorcontrib>Li, Tianqi</creatorcontrib><creatorcontrib>Wang, Meng</creatorcontrib><creatorcontrib>Guo, Tiande</creatorcontrib><creatorcontrib>Chen, Jingdong</creatorcontrib><creatorcontrib>Li, Bonan</creatorcontrib><creatorcontrib>Yang, Ming</creatorcontrib><title>Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis</title><title>arXiv.org</title><description>Recent works in implicit representations, such as Neural Radiance Fields (NeRF), have advanced the generation of realistic and animatable head avatars from video sequences. These implicit methods are still confronted by visual artifacts and jitters, since the lack of explicit geometric constraints poses a fundamental challenge in accurately modeling complex facial deformations. In this paper, we introduce Dynamic Tetrahedra (DynTet), a novel hybrid representation that encodes explicit dynamic meshes by neural networks to ensure geometric consistency across various motions and viewpoints. DynTet is parameterized by the coordinate-based networks which learn signed distance, deformation, and material texture, anchoring the training data into a predefined tetrahedra grid. 
Leveraging Marching Tetrahedra, DynTet efficiently decodes textured meshes with a consistent topology, enabling fast rendering through a differentiable rasterizer and supervision via a pixel loss. To enhance training efficiency, we incorporate classical 3D Morphable Models to facilitate geometry learning and define a canonical space for simplifying texture learning. These advantages are readily achievable owing to the effective geometric representation employed in DynTet. Compared with prior works, DynTet demonstrates significant improvements in fidelity, lip synchronization, and real-time performance according to various metrics. Beyond producing stable and visually appealing synthesis videos, our method also outputs the dynamic meshes which is promising to enable many emerging applications.</description><subject>Avatars</subject><subject>Deformation</subject><subject>Geometric constraints</subject><subject>Implicit methods</subject><subject>Learning</subject><subject>Neural networks</subject><subject>Synchronism</subject><subject>Synthesis</subject><subject>Tetrahedra</subject><subject>Texture</subject><subject>Three dimensional models</subject><subject>Time synchronization</subject><subject>Topology</subject><subject>Training</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNyrEOgjAUQNHGxESi_EMTZxJ4FdAZNQwuRnbyAg9arEXbMvD3auIHON3hngULQIgk2u8AVix0bojjGLIc0lQErLgQWqNMz4-zwYdqeEXeoqTWIu9Gy0vVy-g6oVZ-5hXq-9eWhC2_zcZLcspt2LJD7Sj8dc2251NVlNHTjq-JnK-HcbLms2o4CMggSUQu_lNvzfM5zA</recordid><startdate>20240227</startdate><enddate>20240227</enddate><creator>Zhang, Zicheng</creator><creator>Zheng, Ruobing</creator><creator>Liu, Ziwen</creator><creator>Han, Congying</creator><creator>Li, Tianqi</creator><creator>Wang, Meng</creator><creator>Guo, Tiande</creator><creator>Chen, Jingdong</creator><creator>Li, Bonan</creator><creator>Yang, 
Ming</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PIMPY</scope><scope>PKEHL</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240227</creationdate><title>Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis</title><author>Zhang, Zicheng ; Zheng, Ruobing ; Liu, Ziwen ; Han, Congying ; Li, Tianqi ; Wang, Meng ; Guo, Tiande ; Chen, Jingdong ; Li, Bonan ; Yang, Ming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_29326211373</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Avatars</topic><topic>Deformation</topic><topic>Geometric constraints</topic><topic>Implicit methods</topic><topic>Learning</topic><topic>Neural networks</topic><topic>Synchronism</topic><topic>Synthesis</topic><topic>Tetrahedra</topic><topic>Texture</topic><topic>Three dimensional models</topic><topic>Time synchronization</topic><topic>Topology</topic><topic>Training</topic><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Zicheng</creatorcontrib><creatorcontrib>Zheng, Ruobing</creatorcontrib><creatorcontrib>Liu, Ziwen</creatorcontrib><creatorcontrib>Han, Congying</creatorcontrib><creatorcontrib>Li, Tianqi</creatorcontrib><creatorcontrib>Wang, Meng</creatorcontrib><creatorcontrib>Guo, Tiande</creatorcontrib><creatorcontrib>Chen, Jingdong</creatorcontrib><creatorcontrib>Li, Bonan</creatorcontrib><creatorcontrib>Yang, Ming</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology 
Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Engineering Collection</collection><collection>ProQuest Engineering Database</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied & Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Zicheng</au><au>Zheng, Ruobing</au><au>Liu, Ziwen</au><au>Han, Congying</au><au>Li, Tianqi</au><au>Wang, Meng</au><au>Guo, Tiande</au><au>Chen, Jingdong</au><au>Li, Bonan</au><au>Yang, Ming</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis</atitle><jtitle>arXiv.org</jtitle><date>2024-02-27</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Recent works in implicit representations, such as Neural Radiance Fields (NeRF), have advanced the generation of realistic and animatable head avatars from video sequences. 
These implicit methods are still confronted by visual artifacts and jitters, since the lack of explicit geometric constraints poses a fundamental challenge in accurately modeling complex facial deformations. In this paper, we introduce Dynamic Tetrahedra (DynTet), a novel hybrid representation that encodes explicit dynamic meshes by neural networks to ensure geometric consistency across various motions and viewpoints. DynTet is parameterized by the coordinate-based networks which learn signed distance, deformation, and material texture, anchoring the training data into a predefined tetrahedra grid. Leveraging Marching Tetrahedra, DynTet efficiently decodes textured meshes with a consistent topology, enabling fast rendering through a differentiable rasterizer and supervision via a pixel loss. To enhance training efficiency, we incorporate classical 3D Morphable Models to facilitate geometry learning and define a canonical space for simplifying texture learning. These advantages are readily achievable owing to the effective geometric representation employed in DynTet. Compared with prior works, DynTet demonstrates significant improvements in fidelity, lip synchronization, and real-time performance according to various metrics. Beyond producing stable and visually appealing synthesis videos, our method also outputs the dynamic meshes which is promising to enable many emerging applications.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-02
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2932621137
source	Publicly Available Content (ProQuest)
subjects	Avatars ; Deformation ; Geometric constraints ; Implicit methods ; Learning ; Neural networks ; Synchronism ; Synthesis ; Tetrahedra ; Texture ; Three dimensional models ; Time synchronization ; Topology ; Training
title	Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-24T20%3A47%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Learning%20Dynamic%20Tetrahedra%20for%20High-Quality%20Talking%20Head%20Synthesis&rft.jtitle=arXiv.org&rft.au=Zhang,%20Zicheng&rft.date=2024-02-27&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2932621137%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_29326211373%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2932621137&rft_id=info:pmid/&rfr_iscdi=true