SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation
Image segmentation plays an important role in vision understanding. Recently, the emerging vision foundation models continuously achieved superior performance on various tasks. Following such success, in this paper, we prove that the Segment Anything Model 2 (SAM2) can be a strong encoder for U-shaped segmentation models. We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation. Specifically, SAM2-UNet adopts the Hiera backbone of SAM2 as the encoder, while the decoder uses the classic U-shaped design. Additionally, adapters are inserted into the encoder to allow parameter-efficient fine-tuning. Preliminary experiments on various downstream tasks, such as camouflaged object detection, salient object detection, marine animal segmentation, mirror detection, and polyp segmentation, demonstrate that our SAM2-UNet can simply beat existing specialized state-of-the-art methods without bells and whistles. Project page: https://github.com/WZH0120/SAM2-UNet
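To make the described design concrete, here is a minimal PyTorch sketch of the pattern the abstract outlines: a frozen hierarchical encoder, lightweight adapters inserted after its stages for parameter-efficient fine-tuning, and a classic U-shaped decoder that fuses skip features on the way back up. This is an illustrative sketch only, not the authors' implementation: `ToyHieraStage`, `Adapter`, `DecoderBlock`, and the channel sizes are hypothetical placeholders standing in for SAM2's actual Hiera backbone and the configuration used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Adapter(nn.Module):
    """Trainable bottleneck inserted after a frozen encoder stage (residual)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        hidden = max(channels // reduction, 8)
        self.down = nn.Conv2d(channels, hidden, kernel_size=1)
        self.up = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, x):
        return x + self.up(F.relu(self.down(x)))


class ToyHieraStage(nn.Module):
    """Placeholder for one hierarchical encoder stage: halves resolution, changes channels."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class DecoderBlock(nn.Module):
    """U-Net style decoder block: upsample, concatenate the skip feature, convolve."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch + skip_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear", align_corners=False)
        return self.conv(torch.cat([x, skip], dim=1))


class ToySAM2UNet(nn.Module):
    def __init__(self, channels=(32, 64, 128, 256), num_classes=1):
        super().__init__()
        chs = (3,) + channels
        self.stages = nn.ModuleList([ToyHieraStage(chs[i], chs[i + 1]) for i in range(4)])
        # Freeze the "backbone"; only adapters, decoder, and head are trained.
        for p in self.stages.parameters():
            p.requires_grad = False
        self.adapters = nn.ModuleList([Adapter(c) for c in channels])
        self.decoders = nn.ModuleList([
            DecoderBlock(channels[3], channels[2], channels[2]),
            DecoderBlock(channels[2], channels[1], channels[1]),
            DecoderBlock(channels[1], channels[0], channels[0]),
        ])
        self.head = nn.Conv2d(channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        feats = []
        for stage, adapter in zip(self.stages, self.adapters):
            x = adapter(stage(x))  # frozen stage + trainable adapter
            feats.append(x)
        f1, f2, f3, f4 = feats
        d = self.decoders[0](f4, f3)
        d = self.decoders[1](d, f2)
        d = self.decoders[2](d, f1)
        return self.head(d)  # low-resolution mask logits; upsample to input size as needed


if __name__ == "__main__":
    model = ToySAM2UNet()
    mask_logits = model(torch.randn(1, 3, 256, 256))
    print(mask_logits.shape)  # torch.Size([1, 1, 128, 128])
```

Because the encoder stages are frozen, only the adapters, decoder blocks, and segmentation head contribute trainable parameters, which is the parameter-efficient fine-tuning idea the abstract refers to; the real SAM2-UNet code is available at the project page linked above.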
Published in: | arXiv.org 2024-08 |
---|---|
Main Authors: | Xiong, Xinyu; Wu, Zihuang; Tan, Shuangyi; Li, Wenxue; Tang, Feilong; Chen, Ying; Li, Siying; Ma, Jie; Li, Guanbin |
Format: | Article |
Language: | English |
Subjects: | Coders; Design parameters; Image segmentation; Marine animals; Medical imaging; Object recognition; Salience |
Online Access: | Get full text |

container_title | arXiv.org |
---|---|
creator | Xiong, Xinyu; Wu, Zihuang; Tan, Shuangyi; Li, Wenxue; Tang, Feilong; Chen, Ying; Li, Siying; Ma, Jie; Li, Guanbin |
description | Image segmentation plays an important role in vision understanding. Recently, the emerging vision foundation models continuously achieved superior performance on various tasks. Following such success, in this paper, we prove that the Segment Anything Model 2 (SAM2) can be a strong encoder for U-shaped segmentation models. We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation. Specifically, SAM2-UNet adopts the Hiera backbone of SAM2 as the encoder, while the decoder uses the classic U-shaped design. Additionally, adapters are inserted into the encoder to allow parameter-efficient fine-tuning. Preliminary experiments on various downstream tasks, such as camouflaged object detection, salient object detection, marine animal segmentation, mirror detection, and polyp segmentation, demonstrate that our SAM2-UNet can simply beat existing specialized state-of-the-art methods without bells and whistles. Project page: \url{https://github.com/WZH0120/SAM2-UNet}. |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-08 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3094565596 |
source | Publicly Available Content Database |
subjects | Coders; Design parameters; Image segmentation; Marine animals; Medical imaging; Object recognition; Salience |
title | SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation |