SHPR-Net: Deep Semantic Hand Pose Regression From Point Clouds
3-D hand pose estimation is an essential problem for human-computer interaction. Most existing depth-based hand pose estimation methods consume a 2-D depth map or a 3-D volume via 2-D/3-D convolutional neural networks. In this paper, we propose a deep semantic hand pose regression network (SHPR-Net) for hand pose estimation from point sets, which consists of two subnetworks: a semantic segmentation subnetwork and a hand pose regression subnetwork. The semantic segmentation network assigns a semantic label to each point in the point set. The pose regression network integrates the semantic priors through both input and late fusion strategies and regresses the final hand pose. Two transformation matrices are learned from the point set and applied to transform the input point cloud and inversely transform the output pose, respectively, which makes SHPR-Net more robust to geometric transformations. Experiments on the NYU, ICVL, and MSRA hand pose data sets demonstrate that SHPR-Net achieves performance on par with state-of-the-art methods. We also show that our method can be naturally extended to hand pose estimation from multi-view depth data and achieves further improvement on the NYU data set.
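The abstract outlines a two-subnetwork pipeline: per-point semantic segmentation, a regression branch that fuses the semantic priors at both the input and a late stage, and a learned transform applied to the input cloud whose inverse is applied to the output pose. Below is a minimal, hypothetical PyTorch sketch of that design, assuming a simplified PointNet-style feature extractor; all layer sizes, class names, and the part/joint counts are illustrative assumptions, not the authors' published architecture.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the two-subnetwork design described in the abstract.
# Sizes and names below are assumptions for illustration only.
NUM_PARTS = 12    # semantic hand-part labels per point (assumed)
NUM_JOINTS = 21   # 3-D joints regressed (dataset-dependent)

class TNet(nn.Module):
    """Learns a 3x3 transform from the point set (PointNet-style T-Net)."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
        )
        self.fc = nn.Linear(128, 9)

    def forward(self, pts):                      # pts: (B, 3, N)
        feat = self.mlp(pts).max(dim=2).values   # global max-pool over points
        return self.fc(feat).view(-1, 3, 3)      # (B, 3, 3) transform

class SHPRNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.tnet = TNet()
        # Segmentation subnetwork: per-point semantic logits.
        self.seg = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, NUM_PARTS, 1),
        )
        # Regression subnetwork: consumes xyz + semantic priors (input fusion).
        self.reg = nn.Sequential(
            nn.Conv1d(3 + NUM_PARTS, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 256, 1), nn.ReLU(),
        )
        # Late fusion: global point feature + pooled semantic feature.
        self.head = nn.Linear(256 + NUM_PARTS, NUM_JOINTS * 3)

    def forward(self, pts):                      # pts: (B, 3, N)
        T = self.tnet(pts)                       # learned input transform
        pts_t = torch.bmm(T, pts)                # canonicalize the point cloud
        sem = self.seg(pts_t)                    # (B, NUM_PARTS, N) logits
        x = torch.cat([pts_t, sem], dim=1)       # input fusion of priors
        feat = self.reg(x).max(dim=2).values     # (B, 256) global feature
        sem_glob = sem.max(dim=2).values         # late fusion of priors
        joints = self.head(torch.cat([feat, sem_glob], dim=1))
        joints = joints.view(-1, NUM_JOINTS, 3)
        # Inversely transform the predicted pose back to the input frame.
        return torch.bmm(joints, torch.inverse(T).transpose(1, 2))

# Usage (shapes only): SHPRNetSketch()(torch.randn(2, 3, 1024)) -> (2, 21, 3)
```

Applying the learned transform to the input and its inverse to the output is what the abstract credits for the robustness to geometric transformations: the subnetworks then operate in a canonicalized frame regardless of how the hand is oriented in the original cloud.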
Published in: | IEEE Access, 2018-01, Vol. 6, p. 43425-43439 |
---|---|
Main Authors: | Chen, Xinghao; Wang, Guijin; Zhang, Cairong; Kim, Tae-Kyun; Ji, Xiangyang |
Format: | Article |
Language: | English |
Subjects: | Artificial neural networks; Datasets; Deep learning; Feature extraction; Geometric transformation; Hand pose estimation; Human-computer interaction; Machine learning; Point cloud; Pose estimation; Regression; Semantic segmentation; Semantics; Three-dimensional models; Three-dimensional displays; Transforms; Two-dimensional displays |
Publisher: | Piscataway: IEEE |
DOI: | 10.1109/ACCESS.2018.2863540 |
ISSN: | 2169-3536 |
Online Access: | IEEE Xplore Open Access Journals |