SAMNet: Stereoscopically Attentive Multi-Scale Network for Lightweight Salient Object Detection
Recent progress on salient object detection (SOD) mostly benefits from the explosive development of Convolutional Neural Networks (CNNs). However, much of the improvement comes with the larger network size and heavier computation overhead, which, in our view, is not mobile-friendly and thus difficult to deploy in practice. To promote more practical SOD systems, we introduce a novel Stereoscopically Attentive Multi-scale (SAM) module, which adopts a stereoscopic attention mechanism to adaptively fuse the features of various scales. Embarking on this module, we propose an extremely lightweight network, namely SAMNet, for SOD. Extensive experiments on popular benchmarks demonstrate that the proposed SAMNet yields comparable accuracy with state-of-the-art methods while running at a GPU speed of 343 fps and a CPU speed of 5 fps for 336×336 inputs with only 1.33M parameters. Therefore, SAMNet paves a new path towards SOD. The source code is available on the project page https://mmcheng.net/SAMNet/.
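The abstract's core idea — adaptively fusing feature maps from several scales with attention weights — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their SAM module also attends over channel and spatial dimensions; see the paper and project page); the function names, shapes, and the way the attention logits are produced here are illustrative assumptions only. The sketch shows the fusion step: a softmax over the scale axis decides, per channel and per position, how much each scale contributes.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_multiscale(features, scores):
    """Attention-weighted fusion of same-shape multi-scale feature maps.

    features: list of N arrays of shape (C, H, W), one per scale
              (assumed already resampled to a common resolution)
    scores:   array of shape (N, C, H, W) of unnormalized attention
              logits (hypothetical; in the paper these are learned)
    Returns the fused (C, H, W) map.
    """
    stacked = np.stack(features)            # (N, C, H, W)
    weights = softmax(scores, axis=0)       # normalize across the N scales
    return (weights * stacked).sum(axis=0)  # convex combination per position
```

With all-zero logits the weights are uniform and the fusion reduces to a plain average of the scales; strongly peaked logits let the module effectively select a single scale per position, which is the adaptive behavior the abstract describes.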
Published in: | IEEE Transactions on Image Processing, 2021, Vol. 30, pp. 3804-3814 |
---|---|
Main Authors: | Liu, Yun; Zhang, Xin-Yu; Bian, Jia-Wang; Zhang, Le; Cheng, Ming-Ming |
Format: | Article |
Language: | English |
Subjects: | Object detection; Lightweight salient object detection; Multi-scale learning; Deep learning |
container_end_page | 3814 |
container_issue | |
container_start_page | 3804 |
container_title | IEEE transactions on image processing |
container_volume | 30 |
creator | Liu, Yun; Zhang, Xin-Yu; Bian, Jia-Wang; Zhang, Le; Cheng, Ming-Ming |
description | Recent progress on salient object detection (SOD) mostly benefits from the explosive development of Convolutional Neural Networks (CNNs). However, much of the improvement comes with the larger network size and heavier computation overhead, which, in our view, is not mobile-friendly and thus difficult to deploy in practice. To promote more practical SOD systems, we introduce a novel Stereoscopically Attentive Multi-scale (SAM) module, which adopts a stereoscopic attention mechanism to adaptively fuse the features of various scales. Embarking on this module, we propose an extremely lightweight network, namely SAMNet, for SOD. Extensive experiments on popular benchmarks demonstrate that the proposed SAMNet yields comparable accuracy with state-of-the-art methods while running at a GPU speed of 343fps and a CPU speed of 5fps for 336 \times 336 inputs with only 1.33M parameters. Therefore, SAMNet paves a new path towards SOD. The source code is available on the project page https://mmcheng.net/SAMNet/ . |
doi_str_mv | 10.1109/TIP.2021.3065239 |
format | article |
fulltext | fulltext |
identifier | ISSN: 1057-7149 |
ispartof | IEEE transactions on image processing, 2021, Vol.30, p.3804-3814 |
issn | 1057-7149 1941-0042 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TIP_2021_3065239 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Artificial neural networks; Deep learning; Explosives detection; Fuses; Lightweight; lightweight saliency detection; Lightweight salient object detection; Modules; multi-scale learning; Object detection; Object recognition; Salience; Semantics; Source code; Stereo image processing; Task analysis; Visualization |
title | SAMNet: Stereoscopically Attentive Multi-Scale Network for Lightweight Salient Object Detection |