
SAMNet: Stereoscopically Attentive Multi-Scale Network for Lightweight Salient Object Detection

Recent progress on salient object detection (SOD) mostly benefits from the explosive development of Convolutional Neural Networks (CNNs). However, much of the improvement comes with larger network sizes and heavier computation overhead, which, in our view, is not mobile-friendly and thus difficult to deploy in practice. To promote more practical SOD systems, we introduce a novel Stereoscopically Attentive Multi-scale (SAM) module, which adopts a stereoscopic attention mechanism to adaptively fuse features at various scales. Building on this module, we propose an extremely lightweight network, SAMNet, for SOD. Extensive experiments on popular benchmarks demonstrate that the proposed SAMNet yields accuracy comparable with state-of-the-art methods while running at 343 fps on a GPU and 5 fps on a CPU for 336×336 inputs, with only 1.33M parameters. SAMNet thus paves a new path towards practical SOD. The source code is available at https://mmcheng.net/SAMNet/.
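The abstract describes adaptively fusing feature maps from several scales with attention weights. As a rough illustration of that general idea (not the paper's actual SAM module — the function name, the global-average-pooling descriptor, and the softmax weighting here are all illustrative assumptions), attention-weighted multi-scale fusion can be sketched as:

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_multiscale(features):
    """Fuse same-shaped feature maps from several scales.

    `features`: list of (C, H, W) arrays, already resized to a common
    resolution. Each scale gets a per-channel attention weight derived
    from its globally average-pooled descriptor; the fused map is the
    weighted sum over scales (weights sum to 1 across scales).
    """
    stack = np.stack(features)               # (S, C, H, W)
    desc = stack.mean(axis=(2, 3))           # (S, C) pooled descriptors
    weights = softmax(desc, axis=0)          # normalize across scales
    fused = (stack * weights[:, :, None, None]).sum(axis=0)
    return fused                             # (C, H, W)
```

Because the weights form a convex combination across scales, every fused value lies between the minimum and maximum of the inputs at that position.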

Bibliographic Details
Published in: IEEE Transactions on Image Processing, 2021, Vol. 30, pp. 3804-3814
Main Authors: Liu, Yun, Zhang, Xin-Yu, Bian, Jia-Wang, Zhang, Le, Cheng, Ming-Ming
Format: Article
Language:English
container_end_page 3814
container_issue
container_start_page 3804
container_title IEEE transactions on image processing
container_volume 30
creator Liu, Yun
Zhang, Xin-Yu
Bian, Jia-Wang
Zhang, Le
Cheng, Ming-Ming
description Recent progress on salient object detection (SOD) mostly benefits from the explosive development of Convolutional Neural Networks (CNNs). However, much of the improvement comes with the larger network size and heavier computation overhead, which, in our view, is not mobile-friendly and thus difficult to deploy in practice. To promote more practical SOD systems, we introduce a novel Stereoscopically Attentive Multi-scale (SAM) module, which adopts a stereoscopic attention mechanism to adaptively fuse the features of various scales. Embarking on this module, we propose an extremely lightweight network, namely SAMNet, for SOD. Extensive experiments on popular benchmarks demonstrate that the proposed SAMNet yields comparable accuracy with state-of-the-art methods while running at a GPU speed of 343 fps and a CPU speed of 5 fps for 336×336 inputs with only 1.33M parameters. Therefore, SAMNet paves a new path towards SOD. The source code is available on the project page https://mmcheng.net/SAMNet/.
doi_str_mv 10.1109/TIP.2021.3065239
format article
publisher United States: IEEE
rights Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
pmid 33735077
coden IIPRE4
orcid 0000-0001-5550-8758; 0000-0002-6930-8674; 0000-0001-6143-0264; 0000-0002-4335-682X; 0000-0003-2046-3363
fulltext fulltext
identifier ISSN: 1057-7149
ispartof IEEE transactions on image processing, 2021, Vol.30, p.3804-3814
issn 1057-7149
1941-0042
language eng
recordid cdi_crossref_primary_10_1109_TIP_2021_3065239
source IEEE Electronic Library (IEL) Journals
subjects Artificial neural networks
Deep learning
Explosives detection
Fuses
Lightweight
lightweight saliency detection
Lightweight salient object detection
Modules
multi-scale learning
Object detection
Object recognition
Salience
Semantics
Source code
Stereo image processing
Task analysis
Visualization
title SAMNet: Stereoscopically Attentive Multi-Scale Network for Lightweight Salient Object Detection