
Human motion reconstruction using deep transformer networks

Bibliographic Details
Published in: Pattern Recognition Letters, 2021-10, Vol. 150, pp. 162-169
Main Authors: Kim, Seong Uk; Jang, Hanyoung; Im, Hyeonseung; Kim, Jongmin
Format: Article
Language: English
Description
Summary:
• An attention-based deep neural network for natural human motion reconstruction.
• An effective model for producing realistic human motion from very few constraints.
• A deep neural network outperforming previous RNN-based methods and the Transformer.
• The best feature combination for faithful human motion reconstruction.

Establishing a human motion reconstruction system driven by very few constraints imposed on the body has been an interesting and important research topic, because it significantly reduces the degrees of freedom to be managed. However, it is a well-known, mathematically ill-posed problem, as the dimensionality of the constraints is much lower than that of the human pose to be determined. It is therefore challenging to reconstruct full-body joint information directly from very few constraints, since many solutions are possible. To address this issue, we present a novel deep learning framework with an attention mechanism, trained on large-scale motion capture (mocap) data, for mapping very few user-defined constraints to human motion as realistically as possible. Our system is built upon attention networks, which can look further back in time to achieve better results. Experimental results show that our network model produces more accurate results than previous approaches. We also conducted several experiments testing all possible combinations of the features extracted from the mocap data, and found the best feature combination for generating high-quality poses.
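The abstract describes an attention-based network that maps a small set of user-defined constraints, observed over a temporal window, to a full-body pose. The paper's exact architecture is not reproduced in this record; the following is only a minimal PyTorch sketch of that general idea, in which the class name, every layer size, the number of constraints, and the joint count are assumptions chosen for illustration.

    # Minimal sketch (not the authors' code): self-attention over a window of
    # sparse constraints, regressing the full-body pose of the latest frame.
    # All dimensions below are hypothetical placeholders.
    import torch
    import torch.nn as nn

    class SparseToPoseAttention(nn.Module):
        def __init__(self, n_constraints=4, constraint_dim=3, n_joints=22,
                     joint_dim=6, d_model=128, n_heads=4, n_layers=3, window=32):
            super().__init__()
            # Per-frame embedding of the flattened constraint vector
            self.embed = nn.Linear(n_constraints * constraint_dim, d_model)
            # Learned positional encoding so attention sees temporal order
            self.pos = nn.Parameter(torch.zeros(window, d_model))
            layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                               dim_feedforward=4 * d_model,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            # Regress all joint parameters of the full-body pose
            self.head = nn.Linear(d_model, n_joints * joint_dim)

        def forward(self, constraints):
            # constraints: (batch, window, n_constraints * constraint_dim)
            h = self.embed(constraints) + self.pos   # add temporal positions
            h = self.encoder(h)                      # attend over the window
            return self.head(h[:, -1])               # pose for the latest frame

    model = SparseToPoseAttention()
    x = torch.randn(2, 32, 4 * 3)   # 2 clips, 32 frames, four 3-D constraints
    pose = model(x)                 # shape: (2, 22 * 6)

Self-attention over the whole input window is what lets such a model "look further back" than a fixed-order recurrent state, which is the property the summary highlights over RNN-based baselines.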
ISSN: 0167-8655; 1872-7344
DOI: 10.1016/j.patrec.2021.06.018