
Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data


Bibliographic Details
Main Authors: Kumar, Arun C.S. ; Bhandarkar, Suchendra M. ; Prasad, Mukta
Format: Conference Proceeding
Language: English
container_end_page 11708
container_issue
container_start_page 1170
container_title
container_volume
creator Kumar, Arun C.S.
Bhandarkar, Suchendra M.
Prasad, Mukta
description We propose a novel method for class-specific, single-view object detection, pose estimation and deformable 3D reconstruction, in which a two-pronged (sparse semantic and dense shape) representation is learned automatically from natural image data. Given a new image, the method estimates the camera pose and a deformable reconstruction using an effective, incremental optimization. It extracts a continuous, scaled-orthographic pose (without resorting to regression and/or discretized 1D azimuth-based representations) and reconstructs a full free-form shape (rather than retrieving the closest 3D CAD shape proxy, as is typical in the state of the art). We learn our two-pronged model purely from natural image data, as automatically and faithfully as possible, reducing the human effort and bias typical of this problem. The pipeline combines data-driven, deep-learning-based semantic part learning with principled modelling and effective optimization of the problem's physics, shape deformation, pose and occlusion. The underlying sparse (part-based) representation of the object is computationally efficient for detection and discriminative tasks, whereas the overlaid dense (skin-like) representation models and realistically renders comprehensive 3D structure, including natural deformation and occlusion. The results for the car class are visually pleasing and, importantly, also outperform the state of the art quantitatively. Our contribution to visual scene understanding through the two-pronged object representation shows promise for more accurate 3D scene understanding in real-world applications such as virtual/mixed reality and autonomous navigation, to cite a few.
doi_str_mv 10.1109/CVPRW.2018.00153
format conference_proceeding
fulltext fulltext_linktorsrc
identifier EISSN: 2160-7516
ispartof 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, p.1170-11708
issn 2160-7516
language eng
recordid cdi_ieee_primary_8575307
source IEEE Xplore All Conference Series
subjects Deformable models
Image reconstruction
Shape
Solid modeling
Strain
Three-dimensional displays
Two dimensional displays
title Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T09%3A04%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Learning%20Hierarchical%20Models%20for%20Class-Specific%20Reconstruction%20from%20Natural%20Data&rft.btitle=2018%20IEEE/CVF%20Conference%20on%20Computer%20Vision%20and%20Pattern%20Recognition%20Workshops%20(CVPRW)&rft.au=Kumar,%20Arun%20C.S.&rft.date=2018-06&rft.spage=1170&rft.epage=11708&rft.pages=1170-11708&rft.eissn=2160-7516&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CVPRW.2018.00153&rft.eisbn=9781538661000&rft.eisbn_list=1538661004&rft_dat=%3Cieee_CHZPO%3E8575307%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i175t-4a92a4a14f90595f7612fc9d9cfb0eed633d5ae331bb85273584bb5379272ebb3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8575307&rfr_iscdi=true