Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning
Monocular depth estimation aims to infer a depth map from a single image. Although supervised learning-based methods have achieved remarkable performance, they generally rely on a large amount of labor-intensively annotated data. Self-supervised methods, on the other hand, do not require any annotat...
Published in: | ACM transactions on multimedia computing communications and applications 2024-08, Vol.20 (8), p.1-19, Article 250 |
---|---|
Main Authors: | Peng, Bo ; Sun, Lin ; Lei, Jianjun ; Liu, Bingzheng ; Shen, Haifeng ; Li, Wanqing ; Huang, Qingming |
Format: | Article |
Language: | English |
Subjects: | Information systems ; Multimedia information systems |
cited_by | cdi_FETCH-LOGICAL-a174t-34a10b91c9ce8ff31828a1278960487362660f89678817055dcb8784287fc35c3 |
---|---|
cites | cdi_FETCH-LOGICAL-a174t-34a10b91c9ce8ff31828a1278960487362660f89678817055dcb8784287fc35c3 |
container_end_page | 19 |
container_issue | 8 |
container_start_page | 1 |
container_title | ACM transactions on multimedia computing communications and applications |
container_volume | 20 |
creator | Peng, Bo ; Sun, Lin ; Lei, Jianjun ; Liu, Bingzheng ; Shen, Haifeng ; Li, Wanqing ; Huang, Qingming |
description | Monocular depth estimation aims to infer a depth map from a single image. Although supervised learning-based methods have achieved remarkable performance, they generally rely on a large amount of labor-intensively annotated data. Self-supervised methods, on the other hand, do not require any annotation of ground-truth depth and have recently attracted increasing attention. In this work, we propose a self-supervised monocular depth estimation network via binocular geometric correlation learning. Specifically, considering the inter-view geometric correlation, a binocular cue prediction module is presented to generate the auxiliary vision cue for the self-supervised learning of monocular depth estimation. Then, to deal with the occlusion in depth estimation, an occlusion interference attenuated constraint is developed to guide the supervision of the network by inferring the occlusion region and producing paired occlusion masks. Experimental results on two popular benchmark datasets have demonstrated that the proposed network obtains competitive results compared to state-of-the-art self-supervised methods and achieves comparable results to some popular supervised methods. |
doi_str_mv | 10.1145/3663570 |
format | article |
fullrecord | <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3663570</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3663570</sourcerecordid><originalsourceid>FETCH-LOGICAL-a174t-34a10b91c9ce8ff31828a1278960487362660f89678817055dcb8784287fc35c3</originalsourceid><addsrcrecordid>eNo9kL1PwzAUxC0EEqUgdiZvTAG_-DMjlFKQghgKYoxc1wajJI7stFL_e4LSdnr3dD-dTofQNZA7AMbvqRCUS3KCJsA5ZEIJfnrUXJ6ji5R-CaGCMzFBX0tbu2y56Wzc-mTX-C20wWxqHfGT7fofPE-9b3TvQ4u3XuNHf7AXNjS2j97gWYjR1iNTWh1b335fojOn62Sv9neKPp_nH7OXrHxfvM4eykyDZH1GmQayKsAUxirnKKhcacilKgRhSlKRC0Hc8EmlQBLO12alpGK5ks5QbugU3Y65JoaUonVVF4e-cVcBqf73qPZ7DOTNSGrTHKGD-QflBlm1</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning</title><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>Peng, Bo ; Sun, Lin ; Lei, Jianjun ; Liu, Bingzheng ; Shen, Haifeng ; Li, Wanqing ; Huang, Qingming</creator><creatorcontrib>Peng, Bo ; Sun, Lin ; Lei, Jianjun ; Liu, Bingzheng ; Shen, Haifeng ; Li, Wanqing ; Huang, Qingming</creatorcontrib><description>Monocular depth estimation aims to infer a depth map from a single image. Although supervised learning-based methods have achieved remarkable performance, they generally rely on a large amount of labor-intensively annotated data. Self-supervised methods, on the other hand, do not require any annotation of ground-truth depth and have recently attracted increasing attention. In this work, we propose a self-supervised monocular depth estimation network via binocular geometric correlation learning. 
Specifically, considering the inter-view geometric correlation, a binocular cue prediction module is presented to generate the auxiliary vision cue for the self-supervised learning of monocular depth estimation. Then, to deal with the occlusion in depth estimation, an occlusion interference attenuated constraint is developed to guide the supervision of the network by inferring the occlusion region and producing paired occlusion masks. Experimental results on two popular benchmark datasets have demonstrated that the proposed network obtains competitive results compared to state-of-the-art self-supervised methods and achieves comparable results to some popular supervised methods.</description><identifier>ISSN: 1551-6857</identifier><identifier>EISSN: 1551-6865</identifier><identifier>DOI: 10.1145/3663570</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Information systems ; Multimedia information systems</subject><ispartof>ACM transactions on multimedia computing communications and applications, 2024-08, Vol.20 (8), p.1-19, Article 250</ispartof><rights>Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 
Request permissions from</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a174t-34a10b91c9ce8ff31828a1278960487362660f89678817055dcb8784287fc35c3</citedby><cites>FETCH-LOGICAL-a174t-34a10b91c9ce8ff31828a1278960487362660f89678817055dcb8784287fc35c3</cites><orcidid>0000-0003-3171-7680 ; 0000-0002-6949-4147 ; 0000-0002-2654-3084 ; 0000-0001-7542-296X ; 0000-0002-6616-453X ; 0009-0002-9794-4915 ; 0000-0002-4427-2687</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Peng, Bo</creatorcontrib><creatorcontrib>Sun, Lin</creatorcontrib><creatorcontrib>Lei, Jianjun</creatorcontrib><creatorcontrib>Liu, Bingzheng</creatorcontrib><creatorcontrib>Shen, Haifeng</creatorcontrib><creatorcontrib>Li, Wanqing</creatorcontrib><creatorcontrib>Huang, Qingming</creatorcontrib><title>Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning</title><title>ACM transactions on multimedia computing communications and applications</title><addtitle>ACM TOMM</addtitle><description>Monocular depth estimation aims to infer a depth map from a single image. Although supervised learning-based methods have achieved remarkable performance, they generally rely on a large amount of labor-intensively annotated data. Self-supervised methods, on the other hand, do not require any annotation of ground-truth depth and have recently attracted increasing attention. In this work, we propose a self-supervised monocular depth estimation network via binocular geometric correlation learning. Specifically, considering the inter-view geometric correlation, a binocular cue prediction module is presented to generate the auxiliary vision cue for the self-supervised learning of monocular depth estimation. 
Then, to deal with the occlusion in depth estimation, an occlusion interference attenuated constraint is developed to guide the supervision of the network by inferring the occlusion region and producing paired occlusion masks. Experimental results on two popular benchmark datasets have demonstrated that the proposed network obtains competitive results compared to state-of-the-art self-supervised methods and achieves comparable results to some popular supervised methods.</description><subject>Information systems</subject><subject>Multimedia information systems</subject><issn>1551-6857</issn><issn>1551-6865</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9kL1PwzAUxC0EEqUgdiZvTAG_-DMjlFKQghgKYoxc1wajJI7stFL_e4LSdnr3dD-dTofQNZA7AMbvqRCUS3KCJsA5ZEIJfnrUXJ6ji5R-CaGCMzFBX0tbu2y56Wzc-mTX-C20wWxqHfGT7fofPE-9b3TvQ4u3XuNHf7AXNjS2j97gWYjR1iNTWh1b335fojOn62Sv9neKPp_nH7OXrHxfvM4eykyDZH1GmQayKsAUxirnKKhcacilKgRhSlKRC0Hc8EmlQBLO12alpGK5ks5QbugU3Y65JoaUonVVF4e-cVcBqf73qPZ7DOTNSGrTHKGD-QflBlm1</recordid><startdate>20240831</startdate><enddate>20240831</enddate><creator>Peng, Bo</creator><creator>Sun, Lin</creator><creator>Lei, Jianjun</creator><creator>Liu, Bingzheng</creator><creator>Shen, Haifeng</creator><creator>Li, Wanqing</creator><creator>Huang, Qingming</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-3171-7680</orcidid><orcidid>https://orcid.org/0000-0002-6949-4147</orcidid><orcidid>https://orcid.org/0000-0002-2654-3084</orcidid><orcidid>https://orcid.org/0000-0001-7542-296X</orcidid><orcidid>https://orcid.org/0000-0002-6616-453X</orcidid><orcidid>https://orcid.org/0009-0002-9794-4915</orcidid><orcidid>https://orcid.org/0000-0002-4427-2687</orcidid></search><sort><creationdate>20240831</creationdate><title>Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning</title><author>Peng, Bo ; Sun, 
Lin ; Lei, Jianjun ; Liu, Bingzheng ; Shen, Haifeng ; Li, Wanqing ; Huang, Qingming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a174t-34a10b91c9ce8ff31828a1278960487362660f89678817055dcb8784287fc35c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Information systems</topic><topic>Multimedia information systems</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Peng, Bo</creatorcontrib><creatorcontrib>Sun, Lin</creatorcontrib><creatorcontrib>Lei, Jianjun</creatorcontrib><creatorcontrib>Liu, Bingzheng</creatorcontrib><creatorcontrib>Shen, Haifeng</creatorcontrib><creatorcontrib>Li, Wanqing</creatorcontrib><creatorcontrib>Huang, Qingming</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on multimedia computing communications and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Peng, Bo</au><au>Sun, Lin</au><au>Lei, Jianjun</au><au>Liu, Bingzheng</au><au>Shen, Haifeng</au><au>Li, Wanqing</au><au>Huang, Qingming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning</atitle><jtitle>ACM transactions on multimedia computing communications and applications</jtitle><stitle>ACM TOMM</stitle><date>2024-08-31</date><risdate>2024</risdate><volume>20</volume><issue>8</issue><spage>1</spage><epage>19</epage><pages>1-19</pages><artnum>250</artnum><issn>1551-6857</issn><eissn>1551-6865</eissn><abstract>Monocular depth estimation aims to infer a depth map from a single image. Although supervised learning-based methods have achieved remarkable performance, they generally rely on a large amount of labor-intensively annotated data. 
Self-supervised methods, on the other hand, do not require any annotation of ground-truth depth and have recently attracted increasing attention. In this work, we propose a self-supervised monocular depth estimation network via binocular geometric correlation learning. Specifically, considering the inter-view geometric correlation, a binocular cue prediction module is presented to generate the auxiliary vision cue for the self-supervised learning of monocular depth estimation. Then, to deal with the occlusion in depth estimation, an occlusion interference attenuated constraint is developed to guide the supervision of the network by inferring the occlusion region and producing paired occlusion masks. Experimental results on two popular benchmark datasets have demonstrated that the proposed network obtains competitive results compared to state-of-the-art self-supervised methods and achieves comparable results to some popular supervised methods.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3663570</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0003-3171-7680</orcidid><orcidid>https://orcid.org/0000-0002-6949-4147</orcidid><orcidid>https://orcid.org/0000-0002-2654-3084</orcidid><orcidid>https://orcid.org/0000-0001-7542-296X</orcidid><orcidid>https://orcid.org/0000-0002-6616-453X</orcidid><orcidid>https://orcid.org/0009-0002-9794-4915</orcidid><orcidid>https://orcid.org/0000-0002-4427-2687</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1551-6857 |
ispartof | ACM transactions on multimedia computing communications and applications, 2024-08, Vol.20 (8), p.1-19, Article 250 |
issn | 1551-6857 1551-6865 |
language | eng |
recordid | cdi_crossref_primary_10_1145_3663570 |
source | Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list) |
subjects | Information systems ; Multimedia information systems |
title | Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T17%3A09%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Self-Supervised%20Monocular%20Depth%20Estimation%20via%20Binocular%20Geometric%20Correlation%20Learning&rft.jtitle=ACM%20transactions%20on%20multimedia%20computing%20communications%20and%20applications&rft.au=Peng,%20Bo&rft.date=2024-08-31&rft.volume=20&rft.issue=8&rft.spage=1&rft.epage=19&rft.pages=1-19&rft.artnum=250&rft.issn=1551-6857&rft.eissn=1551-6865&rft_id=info:doi/10.1145/3663570&rft_dat=%3Cacm_cross%3E3663570%3C/acm_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a174t-34a10b91c9ce8ff31828a1278960487362660f89678817055dcb8784287fc35c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |