
Simple very deep convolutional network for robust hand pose regression from a single depth image

• A dedicated network structure is presented to regress 3D hand pose from a single depth image.
• We discuss the effect of ConvNet depth on accuracy in the hand pose regression setting.
• We introduce batch normalization and a low-dimensional embedding to help hand pose estimation.
• The proposed...

Bibliographic Details
Published in:	Pattern recognition letters 2019-03, Vol.119, p.205-213
Main Authors:	Fan, Qing; Shen, Xukun; Hu, Yong; Yu, Changjian
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Batch normalization; Deep convolutional network; Depth image; Embedding; Hand pose estimation; Low-dimensional embedding; Post-production processing; Robustness (mathematics)

cited_by	cdi_FETCH-LOGICAL-c334t-6eb5037b6a0ced764c0128cdd432f7d89f4508b043dcc5a9c40c757e7f74ad683
cites	cdi_FETCH-LOGICAL-c334t-6eb5037b6a0ced764c0128cdd432f7d89f4508b043dcc5a9c40c757e7f74ad683
container_end_page	213
container_issue
container_start_page	205
container_title	Pattern recognition letters
container_volume	119
creator	Fan, Qing; Shen, Xukun; Hu, Yong; Yu, Changjian
description	• A dedicated network structure is presented to regress 3D hand pose from a single depth image. • We discuss the effect of ConvNet depth on accuracy in the hand pose regression setting. • We introduce batch normalization and a low-dimensional embedding to help hand pose estimation. • The proposed system is efficient, running at more than 500 fps on a single GPU. • Experimental results show that our method achieves results competitive with state-of-the-art methods. We propose a novel approach for articulated hand pose estimation from a single depth image using a very deep convolutional network. First, a very deep network structure is designed to directly map a single depth image to its corresponding 3D hand joint locations. This approach eliminates the need for hand-crafted intermediate features and sophisticated post-processing stages for robust and accurate hand pose estimation. We use Batch Normalization to accelerate training and prevent the objective function from getting stuck in poor local minima. We introduce a low-dimensional embedding that forces the network to learn the inherent constraints of hand joints, which helps to reduce the cost of reconstructing 3D hand poses from a high-dimensional feature space. We discuss the effect of the convolutional network depth on its accuracy in the hand pose regression setting. Quantitative assessments on two challenging datasets show that our proposed method achieves results competitive with state-of-the-art approaches in terms of accuracy. Moreover, qualitative results show that our proposed method is robust to some difficult hand poses.
doi_str_mv	10.1016/j.patrec.2017.10.019
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2196503543</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167865517303872</els_id><sourcerecordid>2196503543</sourcerecordid><originalsourceid>FETCH-LOGICAL-c334t-6eb5037b6a0ced764c0128cdd432f7d89f4508b043dcc5a9c40c757e7f74ad683</originalsourceid><addsrcrecordid>eNp9kE9LxDAUxIMouK5-Aw8Bz61JkzbtRZDFf7DgQT3HNHndTe02NWlX9tubpZ49PRh-M8wbhK4pSSmhxW2bDmr0oNOMUBGllNDqBC1oKbJEMM5P0SJiIimLPD9HFyG0hJCCVeUCfb7Z3dAB3oM_YAMwYO36veum0bpedbiH8cf5L9w4j72rpzDireoNHlwA7GHjIYRI4sa7HVY42H4T0wwM4xbbndrAJTprVBfg6u8u0cfjw_vqOVm_Pr2s7teJZoyPSQF1TpioC0U0GFFwTWhWamM4yxphyqrhOSlrwpnROleV5kSLXIBoBFemKNkS3cy5g3ffE4RRtm7y8YUgM1oVMTznLFJ8prR3IXho5OBjTX-QlMjjlrKV85byuOVRjVtG291sg_jB3oKXQVvoY1Mb0VEaZ_8P-AX4SYBN</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2196503543</pqid></control><display><type>article</type><title>Simple very deep convolutional network for robust hand pose regression from a single depth image</title><source>ScienceDirect Freedom Collection</source><creator>Fan, Qing ; Shen, Xukun ; Hu, Yong ; Yu, Changjian</creator><creatorcontrib>Fan, Qing ; Shen, Xukun ; Hu, Yong ; Yu, Changjian</creatorcontrib><description>•A dedicate network structure is presented to regress 3D hand pose from a single depth image.•We discuss the effect of the ConvNet depth on its accuracy under the hand pose regression setting.•We introduce batch normalization and a low-dimensional embedding to help hand pose estimation.•The proposed system is efficient enough with more than 500 fps on a single GPU.•Experiments results show that our method can get competitive results to state-of-the-art methods. We propose a novel approach for articulated hand pose estimation from a single depth image using a very deep convolutional network. 
For the first, a very deep network structure is designed to directly maps a single depth image to its corresponding 3D hand joint locations. This approach eliminates the necessity of hand-crafted intermediate features and sophisticated post-processing stages for robust and accurate hand pose estimation. We use Batch Normalization to accelerate training and prevent the objective function from getting stuck in poor local minima. We introduce a low-dimensional embedding forcing the network to learn the inherent constraints of hand joints, which helps to reduce the cost of reconstructing 3D hand poses from high-dimension feature space. We discuss the effect of the convolutional network depth on its accuracy under the hand pose regression setting. Quantitative assessments on two challenging datasets show that our proposed method gets competitive results to state-of-the-art approaches in terms of accuracy. Moreover, qualitative results also show that our proposed method is robust to some difficult hand poses.</description><identifier>ISSN: 0167-8655</identifier><identifier>EISSN: 1872-7344</identifier><identifier>DOI: 10.1016/j.patrec.2017.10.019</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Artificial neural networks ; Batch normalization ; Deep convolutional network ; Depth image ; Embedding ; Hand pose estimation ; Low-dimensional embedding ; Post-production processing ; Robustness (mathematics)</subject><ispartof>Pattern recognition letters, 2019-03, Vol.119, p.205-213</ispartof><rights>2017 Elsevier B.V.</rights><rights>Copyright Elsevier Science Ltd. 
Mar 1, 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c334t-6eb5037b6a0ced764c0128cdd432f7d89f4508b043dcc5a9c40c757e7f74ad683</citedby><cites>FETCH-LOGICAL-c334t-6eb5037b6a0ced764c0128cdd432f7d89f4508b043dcc5a9c40c757e7f74ad683</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Fan, Qing</creatorcontrib><creatorcontrib>Shen, Xukun</creatorcontrib><creatorcontrib>Hu, Yong</creatorcontrib><creatorcontrib>Yu, Changjian</creatorcontrib><title>Simple very deep convolutional network for robust hand pose regression from a single depth image</title><title>Pattern recognition letters</title><description>•A dedicate network structure is presented to regress 3D hand pose from a single depth image.•We discuss the effect of the ConvNet depth on its accuracy under the hand pose regression setting.•We introduce batch normalization and a low-dimensional embedding to help hand pose estimation.•The proposed system is efficient enough with more than 500 fps on a single GPU.•Experiments results show that our method can get competitive results to state-of-the-art methods. We propose a novel approach for articulated hand pose estimation from a single depth image using a very deep convolutional network. For the first, a very deep network structure is designed to directly maps a single depth image to its corresponding 3D hand joint locations. This approach eliminates the necessity of hand-crafted intermediate features and sophisticated post-processing stages for robust and accurate hand pose estimation. We use Batch Normalization to accelerate training and prevent the objective function from getting stuck in poor local minima. 
We introduce a low-dimensional embedding forcing the network to learn the inherent constraints of hand joints, which helps to reduce the cost of reconstructing 3D hand poses from high-dimension feature space. We discuss the effect of the convolutional network depth on its accuracy under the hand pose regression setting. Quantitative assessments on two challenging datasets show that our proposed method gets competitive results to state-of-the-art approaches in terms of accuracy. Moreover, qualitative results also show that our proposed method is robust to some difficult hand poses.</description><subject>Artificial neural networks</subject><subject>Batch normalization</subject><subject>Deep convolutional network</subject><subject>Depth image</subject><subject>Embedding</subject><subject>Hand pose estimation</subject><subject>Low-dimensional embedding</subject><subject>Post-production processing</subject><subject>Robustness (mathematics)</subject><issn>0167-8655</issn><issn>1872-7344</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNp9kE9LxDAUxIMouK5-Aw8Bz61JkzbtRZDFf7DgQT3HNHndTe02NWlX9tubpZ49PRh-M8wbhK4pSSmhxW2bDmr0oNOMUBGllNDqBC1oKbJEMM5P0SJiIimLPD9HFyG0hJCCVeUCfb7Z3dAB3oM_YAMwYO36veum0bpedbiH8cf5L9w4j72rpzDireoNHlwA7GHjIYRI4sa7HVY42H4T0wwM4xbbndrAJTprVBfg6u8u0cfjw_vqOVm_Pr2s7teJZoyPSQF1TpioC0U0GFFwTWhWamM4yxphyqrhOSlrwpnROleV5kSLXIBoBFemKNkS3cy5g3ffE4RRtm7y8YUgM1oVMTznLFJ8prR3IXho5OBjTX-QlMjjlrKV85byuOVRjVtG291sg_jB3oKXQVvoY1Mb0VEaZ_8P-AX4SYBN</recordid><startdate>20190301</startdate><enddate>20190301</enddate><creator>Fan, Qing</creator><creator>Shen, Xukun</creator><creator>Hu, Yong</creator><creator>Yu, Changjian</creator><general>Elsevier B.V</general><general>Elsevier Science 
Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7TK</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20190301</creationdate><title>Simple very deep convolutional network for robust hand pose regression from a single depth image</title><author>Fan, Qing ; Shen, Xukun ; Hu, Yong ; Yu, Changjian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c334t-6eb5037b6a0ced764c0128cdd432f7d89f4508b043dcc5a9c40c757e7f74ad683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Artificial neural networks</topic><topic>Batch normalization</topic><topic>Deep convolutional network</topic><topic>Depth image</topic><topic>Embedding</topic><topic>Hand pose estimation</topic><topic>Low-dimensional embedding</topic><topic>Post-production processing</topic><topic>Robustness (mathematics)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fan, Qing</creatorcontrib><creatorcontrib>Shen, Xukun</creatorcontrib><creatorcontrib>Hu, Yong</creatorcontrib><creatorcontrib>Yu, Changjian</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Pattern recognition letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fan, Qing</au><au>Shen, Xukun</au><au>Hu, Yong</au><au>Yu, 
Changjian</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Simple very deep convolutional network for robust hand pose regression from a single depth image</atitle><jtitle>Pattern recognition letters</jtitle><date>2019-03-01</date><risdate>2019</risdate><volume>119</volume><spage>205</spage><epage>213</epage><pages>205-213</pages><issn>0167-8655</issn><eissn>1872-7344</eissn><abstract>•A dedicate network structure is presented to regress 3D hand pose from a single depth image.•We discuss the effect of the ConvNet depth on its accuracy under the hand pose regression setting.•We introduce batch normalization and a low-dimensional embedding to help hand pose estimation.•The proposed system is efficient enough with more than 500 fps on a single GPU.•Experiments results show that our method can get competitive results to state-of-the-art methods. We propose a novel approach for articulated hand pose estimation from a single depth image using a very deep convolutional network. For the first, a very deep network structure is designed to directly maps a single depth image to its corresponding 3D hand joint locations. This approach eliminates the necessity of hand-crafted intermediate features and sophisticated post-processing stages for robust and accurate hand pose estimation. We use Batch Normalization to accelerate training and prevent the objective function from getting stuck in poor local minima. We introduce a low-dimensional embedding forcing the network to learn the inherent constraints of hand joints, which helps to reduce the cost of reconstructing 3D hand poses from high-dimension feature space. We discuss the effect of the convolutional network depth on its accuracy under the hand pose regression setting. Quantitative assessments on two challenging datasets show that our proposed method gets competitive results to state-of-the-art approaches in terms of accuracy. 
Moreover, qualitative results also show that our proposed method is robust to some difficult hand poses.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.patrec.2017.10.019</doi><tpages>9</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0167-8655
ispartof	Pattern recognition letters, 2019-03, Vol.119, p.205-213
issn	0167-8655 1872-7344
language	eng
recordid	cdi_proquest_journals_2196503543
source	ScienceDirect Freedom Collection
subjects	Artificial neural networks; Batch normalization; Deep convolutional network; Depth image; Embedding; Hand pose estimation; Low-dimensional embedding; Post-production processing; Robustness (mathematics)
title	Simple very deep convolutional network for robust hand pose regression from a single depth image
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T02%3A09%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Simple%20very%20deep%20convolutional%20network%20for%20robust%20hand%20pose%20regression%20from%20a%20single%20depth%20image&rft.jtitle=Pattern%20recognition%20letters&rft.au=Fan,%20Qing&rft.date=2019-03-01&rft.volume=119&rft.spage=205&rft.epage=213&rft.pages=205-213&rft.issn=0167-8655&rft.eissn=1872-7344&rft_id=info:doi/10.1016/j.patrec.2017.10.019&rft_dat=%3Cproquest_cross%3E2196503543%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c334t-6eb5037b6a0ced764c0128cdd432f7d89f4508b043dcc5a9c40c757e7f74ad683%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2196503543&rft_id=info:pmid/&rfr_iscdi=true