Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data

Bibliographic Details
Main Authors:	Kumar, Arun C.S.; Bhandarkar, Suchendra M.; Prasad, Mukta
Format:	Conference Proceeding
Language:	English
Subjects:	Deformable models; Image reconstruction; Shape; Solid modeling; Strain; Three-dimensional displays; Two dimensional displays

cited_by
cites
container_end_page	11708
container_issue
container_start_page	1170
container_title
container_volume
creator	Kumar, Arun C.S.; Bhandarkar, Suchendra M.; Prasad, Mukta
description	We propose a novel method for class-specific, single-view object detection, pose estimation and deformable 3D reconstruction, in which a two-pronged (sparse semantic and dense shape) representation is learned automatically from natural image data. Given a new image, the method estimates camera pose and a deformable reconstruction using an effective, incremental optimization. It extracts a continuous, scaled-orthographic pose (without resorting to regression and/or discretized 1D azimuth-based representations) and reconstructs a full free-form shape (rather than retrieving the closest 3D CAD shape proxy, as is typical in the state of the art). We learn our two-pronged model purely from natural image data, as automatically and faithfully as possible, reducing the human effort and bias typical of this problem. The pipeline combines data-driven, deep-learning-based semantic part learning with principled modelling and effective optimization of the problem's physics, shape deformation, pose and occlusion. The underlying sparse (part-based) representation of the object is computationally efficient for detection and discriminative tasks, whereas the overlaid dense (skin-like) representation models and realistically renders comprehensive 3D structure, including natural deformation and occlusion. The results for the car class are visually pleasing and, importantly, also outperform the state of the art quantitatively. Our contribution to visual scene understanding through the two-pronged object representation shows promise for more accurate 3D scene understanding in real-world applications such as virtual/mixed reality and autonomous navigation.
doi_str_mv	10.1109/CVPRW.2018.00153
format	conference_proceeding
fullrecord	<record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_8575307</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8575307</ieee_id><sourcerecordid>8575307</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-4a92a4a14f90595f7612fc9d9cfb0eed633d5ae331bb85273584bb5379272ebb3</originalsourceid><addsrcrecordid>eNotjF1LwzAYRqMgOGbvBW_yB1rfJE2TXEr9mDB1zK_L8SZNNNK1I-ku9u8d6NVzOHAeQi4ZVIyBuW4_VuvPigPTFQCT4oQURukj6KZhAHBKZpw1UCrJmnNS5PxzlAy0lEbMyGrpMQ1x-KKL6BMm9x0d9vRp7HyfaRgTbXvMuXzdeRdDdHTt3TjkKe3dFMeBhjRu6TNO-3SsbnHCC3IWsM---N85eb-_e2sX5fLl4bG9WZaRKTmVNRqONbI6GJBGBtUwHpzpjAsWvO8aITqJXghmrZZcCalra6VQhivurRVzcvX3G733m12KW0yHjZZKClDiF_7oT90</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data</title><source>IEEE Xplore All Conference Series</source><creator>Kumar, Arun C.S. ; Bhandarkar, Suchendra M. ; Prasad, Mukta</creator><creatorcontrib>Kumar, Arun C.S. ; Bhandarkar, Suchendra M. ; Prasad, Mukta</creatorcontrib><description>We propose a novel method for class-specific, single-view, object detection, pose estimation and deformable 3D reconstruction, where a two-pronged (sparse semantic and dense shape) representation is learned from natural image data automatically. Then, given a new image, it can estimate camera pose and deformable reconstruction using an effective, incremental optimization. Our method extracts a continuous, scaled-orthographic pose (without resorting to regression and/or discretized 1D azimuth-based representations). The method reconstructs a full free-form shape (rather than retrieving the closest 3D CAD shape proxy, typical in state-of-the-art). 
We learn our two-pronged model purely from natural image data, as automatically and faithfully as possible, reducing the human effort and bias typical to this problem. The pipeline combines data-driven deep learning based semantic part learning with principled modelling and effective optimization of the problem's physics, shape deformation, pose and occlusion. The underlying sparse (part-based) representation of the object is computationally efficient for purposes like detection and discriminative tasks, whereas the overlaid dense (skin like) representation, models and realistically renders comprehensive 3D structure including natural deformation, occlusion. The results for the car class are visually pleasing, and importantly, outperform the state-of-the-art quantitatively too. Our contribution to visual scene understanding through the two-pronged object representation shows promise for more accurate 3D scene understanding for real world applications on virtual/mixed reality, autonomous navigation, to cite a few.</description><identifier>EISSN: 2160-7516</identifier><identifier>EISBN: 9781538661000</identifier><identifier>EISBN: 1538661004</identifier><identifier>DOI: 10.1109/CVPRW.2018.00153</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Deformable models ; Image reconstruction ; Shape ; Solid modeling ; Strain ; Three-dimensional displays ; Two dimensional displays</subject><ispartof>2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, 
p.1170-11708</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8575307$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8575307$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Kumar, Arun C.S.</creatorcontrib><creatorcontrib>Bhandarkar, Suchendra M.</creatorcontrib><creatorcontrib>Prasad, Mukta</creatorcontrib><title>Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data</title><title>2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</title><addtitle>CVPRW</addtitle><description>We propose a novel method for class-specific, single-view, object detection, pose estimation and deformable 3D reconstruction, where a two-pronged (sparse semantic and dense shape) representation is learned from natural image data automatically. Then, given a new image, it can estimate camera pose and deformable reconstruction using an effective, incremental optimization. Our method extracts a continuous, scaled-orthographic pose (without resorting to regression and/or discretized 1D azimuth-based representations). The method reconstructs a full free-form shape (rather than retrieving the closest 3D CAD shape proxy, typical in state-of-the-art). We learn our two-pronged model purely from natural image data, as automatically and faithfully as possible, reducing the human effort and bias typical to this problem. The pipeline combines data-driven deep learning based semantic part learning with principled modelling and effective optimization of the problem's physics, shape deformation, pose and occlusion. 
The underlying sparse (part-based) representation of the object is computationally efficient for purposes like detection and discriminative tasks, whereas the overlaid dense (skin like) representation, models and realistically renders comprehensive 3D structure including natural deformation, occlusion. The results for the car class are visually pleasing, and importantly, outperform the state-of-the-art quantitatively too. Our contribution to visual scene understanding through the two-pronged object representation shows promise for more accurate 3D scene understanding for real world applications on virtual/mixed reality, autonomous navigation, to cite a few.</description><subject>Deformable models</subject><subject>Image reconstruction</subject><subject>Shape</subject><subject>Solid modeling</subject><subject>Strain</subject><subject>Three-dimensional displays</subject><subject>Two dimensional displays</subject><issn>2160-7516</issn><isbn>9781538661000</isbn><isbn>1538661004</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2018</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotjF1LwzAYRqMgOGbvBW_yB1rfJE2TXEr9mDB1zK_L8SZNNNK1I-ku9u8d6NVzOHAeQi4ZVIyBuW4_VuvPigPTFQCT4oQURukj6KZhAHBKZpw1UCrJmnNS5PxzlAy0lEbMyGrpMQ1x-KKL6BMm9x0d9vRp7HyfaRgTbXvMuXzdeRdDdHTt3TjkKe3dFMeBhjRu6TNO-3SsbnHCC3IWsM---N85eb-_e2sX5fLl4bG9WZaRKTmVNRqONbI6GJBGBtUwHpzpjAsWvO8aITqJXghmrZZcCalra6VQhivurRVzcvX3G733m12KW0yHjZZKClDiF_7oT90</recordid><startdate>201806</startdate><enddate>201806</enddate><creator>Kumar, Arun C.S.</creator><creator>Bhandarkar, Suchendra M.</creator><creator>Prasad, Mukta</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201806</creationdate><title>Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data</title><author>Kumar, Arun C.S. ; Bhandarkar, Suchendra M. 
; Prasad, Mukta</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-4a92a4a14f90595f7612fc9d9cfb0eed633d5ae331bb85273584bb5379272ebb3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Deformable models</topic><topic>Image reconstruction</topic><topic>Shape</topic><topic>Solid modeling</topic><topic>Strain</topic><topic>Three-dimensional displays</topic><topic>Two dimensional displays</topic><toplevel>online_resources</toplevel><creatorcontrib>Kumar, Arun C.S.</creatorcontrib><creatorcontrib>Bhandarkar, Suchendra M.</creatorcontrib><creatorcontrib>Prasad, Mukta</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEL</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kumar, Arun C.S.</au><au>Bhandarkar, Suchendra M.</au><au>Prasad, Mukta</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data</atitle><btitle>2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</btitle><stitle>CVPRW</stitle><date>2018-06</date><risdate>2018</risdate><spage>1170</spage><epage>11708</epage><pages>1170-11708</pages><eissn>2160-7516</eissn><eisbn>9781538661000</eisbn><eisbn>1538661004</eisbn><coden>IEEPAD</coden><abstract>We propose a novel method for class-specific, single-view, object detection, pose estimation and deformable 3D reconstruction, where a two-pronged (sparse semantic and dense shape) representation is learned from 
natural image data automatically. Then, given a new image, it can estimate camera pose and deformable reconstruction using an effective, incremental optimization. Our method extracts a continuous, scaled-orthographic pose (without resorting to regression and/or discretized 1D azimuth-based representations). The method reconstructs a full free-form shape (rather than retrieving the closest 3D CAD shape proxy, typical in state-of-the-art). We learn our two-pronged model purely from natural image data, as automatically and faithfully as possible, reducing the human effort and bias typical to this problem. The pipeline combines data-driven deep learning based semantic part learning with principled modelling and effective optimization of the problem's physics, shape deformation, pose and occlusion. The underlying sparse (part-based) representation of the object is computationally efficient for purposes like detection and discriminative tasks, whereas the overlaid dense (skin like) representation, models and realistically renders comprehensive 3D structure including natural deformation, occlusion. The results for the car class are visually pleasing, and importantly, outperform the state-of-the-art quantitatively too. Our contribution to visual scene understanding through the two-pronged object representation shows promise for more accurate 3D scene understanding for real world applications on virtual/mixed reality, autonomous navigation, to cite a few.</abstract><pub>IEEE</pub><doi>10.1109/CVPRW.2018.00153</doi><tpages>10539</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	EISSN: 2160-7516
ispartof	2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, p.1170-11708
issn	2160-7516
language	eng
recordid	cdi_ieee_primary_8575307
source	IEEE Xplore All Conference Series
subjects	Deformable models; Image reconstruction; Shape; Solid modeling; Strain; Three-dimensional displays; Two dimensional displays
title	Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T09%3A04%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Learning%20Hierarchical%20Models%20for%20Class-Specific%20Reconstruction%20from%20Natural%20Data&rft.btitle=2018%20IEEE/CVF%20Conference%20on%20Computer%20Vision%20and%20Pattern%20Recognition%20Workshops%20(CVPRW)&rft.au=Kumar,%20Arun%20C.S.&rft.date=2018-06&rft.spage=1170&rft.epage=11708&rft.pages=1170-11708&rft.eissn=2160-7516&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CVPRW.2018.00153&rft.eisbn=9781538661000&rft.eisbn_list=1538661004&rft_dat=%3Cieee_CHZPO%3E8575307%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i175t-4a92a4a14f90595f7612fc9d9cfb0eed633d5ae331bb85273584bb5379272ebb3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8575307&rfr_iscdi=true