SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation

Image segmentation plays an important role in vision understanding. Recently, emerging vision foundation models have continuously achieved superior performance on various tasks. Following this success, in this paper we show that the Segment Anything Model 2 (SAM2) can serve as a strong encoder for U-shaped segmentation models. We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation. Specifically, SAM2-UNet adopts the Hiera backbone of SAM2 as the encoder, while the decoder uses the classic U-shaped design. Additionally, adapters are inserted into the encoder to allow parameter-efficient fine-tuning. Preliminary experiments on various downstream tasks, such as camouflaged object detection, salient object detection, marine animal segmentation, mirror detection, and polyp segmentation, demonstrate that SAM2-UNet can surpass existing specialized state-of-the-art methods without bells and whistles. Project page: https://github.com/WZH0120/SAM2-UNet
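The abstract describes three ingredients: SAM2's Hiera backbone as a frozen encoder, lightweight adapters added for parameter-efficient fine-tuning, and a classic U-shaped decoder fed by multi-scale skip connections. The PyTorch sketch below only illustrates how such pieces could be wired together; the placeholder encoder, adapter design, channel widths, and class names are illustrative assumptions, not the authors' implementation (the official code lives at the project page linked above).

```python
# Minimal sketch of the SAM2-UNet wiring described in the abstract.
# PlaceholderEncoder stands in for SAM2's Hiera backbone (assumed 4 stages,
# frozen); the Adapter design and channel widths are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Adapter(nn.Module):
    """Bottleneck adapter: one of the few trainable parts around the encoder."""
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.down = nn.Linear(dim, hidden)
        self.up = nn.Linear(hidden, dim)

    def forward(self, x):                      # x: (B, C, H, W)
        y = x.permute(0, 2, 3, 1)              # channels-last for the MLP
        y = self.up(F.gelu(self.down(y)))
        return x + y.permute(0, 3, 1, 2)       # residual connection


class PlaceholderEncoder(nn.Module):
    """Stand-in for the frozen Hiera encoder: 4 stages, stride 2 each."""
    def __init__(self, dims=(96, 192, 384, 768)):
        super().__init__()
        chans = (3,) + dims[:-1]
        self.stages = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.GELU())
            for c_in, c_out in zip(chans, dims)
        )

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats                            # multi-scale features f1..f4


class SAM2UNetSketch(nn.Module):
    def __init__(self, dims=(96, 192, 384, 768), num_classes=1):
        super().__init__()
        self.encoder = PlaceholderEncoder(dims)
        self.encoder.requires_grad_(False)      # freeze the backbone
        # Adapters applied to stage outputs for simplicity; the paper
        # inserts them inside the encoder itself.
        self.adapters = nn.ModuleList(Adapter(d) for d in dims)
        # U-shaped decoder: fuse skip connections while upsampling.
        self.decoders = nn.ModuleList(
            nn.Sequential(nn.Conv2d(dims[i] + dims[i - 1], dims[i - 1], 3, padding=1), nn.GELU())
            for i in range(len(dims) - 1, 0, -1)
        )
        self.head = nn.Conv2d(dims[0], num_classes, 1)

    def forward(self, x):
        feats = [a(f) for a, f in zip(self.adapters, self.encoder(x))]
        y = feats[-1]                            # deepest features
        for dec, skip in zip(self.decoders, reversed(feats[:-1])):
            y = F.interpolate(y, size=skip.shape[-2:], mode="bilinear", align_corners=False)
            y = dec(torch.cat([y, skip], dim=1))
        mask = self.head(y)
        return F.interpolate(mask, scale_factor=2, mode="bilinear", align_corners=False)


if __name__ == "__main__":
    model = SAM2UNetSketch()
    print(model(torch.randn(1, 3, 352, 352)).shape)  # -> (1, 1, 352, 352)
```

Because the backbone is frozen, only the adapters, decoder, and prediction head receive gradients, which is what makes the fine-tuning parameter-efficient; the real model applies this recipe to SAM2's pretrained Hiera weights rather than a toy convolutional stack.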

Bibliographic Details
Published in: arXiv.org 2024-08
Main Authors: Xiong, Xinyu, Wu, Zihuang, Tan, Shuangyi, Li, Wenxue, Tang, Feilong, Chen, Ying, Li, Siying, Ma, Jie, Li, Guanbin
Format: Article
Language: English
Subjects: Coders; Design parameters; Image segmentation; Marine animals; Medical imaging; Object recognition; Salience
EISSN: 2331-8422
Source: Publicly Available Content Database
Online Access: https://www.proquest.com/docview/3094565596