Enrich, Distill and Fuse: Generalized Few-Shot Semantic Segmentation in Remote Sensing Leveraging Foundation Model's Assistance
Main Authors: | Gao, Tianyi; Ao, Wei; Wang, Xing-ao; Zhao, Yuanhao; Ma, Ping; Xie, Mengjie; Fu, Hang; Ren, Jinchang; Gao, Zhi |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Annotations; Fuses; Image recognition; Natural resources; Semantic segmentation; Semantics; Urban planning |
---|---|
container_end_page | 2780 |
container_start_page | 2771 |
container_title | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) |
creator | Gao, Tianyi; Ao, Wei; Wang, Xing-ao; Zhao, Yuanhao; Ma, Ping; Xie, Mengjie; Fu, Hang; Ren, Jinchang; Gao, Zhi |
description | Generalized few-shot semantic segmentation (GFSS) unifies semantic segmentation with few-shot learning, showing great potential for Earth observation tasks under data scarcity conditions, such as disaster response, urban planning, and natural resource management. GFSS requires simultaneous prediction for both base and novel classes, with the challenge lying in balancing the segmentation performance of both. Therefore, this paper introduces a novel framework named FoMA, a Foundation Model Assisted GFSS framework for remote sensing images. We aim to leverage the generic semantic knowledge inherent in foundation models. Specifically, we employ three strategies named Support Label Enrichment (SLE), Distillation of General Knowledge (DGK), and Voting Fusion of Experts (VFE). For the support images, SLE explores credible unlabeled novel categories, ensuring that each support label contains multiple novel classes. For the query images, the DGK technique allows an effective transfer of generalizable knowledge of foundation models on certain categories to the GFSS learner. Additionally, the VFE strategy integrates the zero-shot prediction of foundation models with the few-shot prediction of GFSS learners, achieving improved segmentation performance. Extensive experiments and ablation studies conducted on the OpenEarthMap few-shot challenge dataset demonstrate that our proposed method achieves state-of-the-art performance. |
doi_str_mv | 10.1109/CVPRW63382.2024.00283 |
format | conference_proceeding |
date | 2024-06-17 |
eisbn | 9798350365474 |
coden | IEEPAD |
publisher | IEEE |
identifier | EISSN: 2160-7516 |
ispartof | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2024, p.2771-2780 |
issn | 2160-7516 |
language | eng |
recordid | cdi_ieee_primary_10677912 |
source | IEEE Xplore All Conference Series |
subjects | Annotations; Fuses; Image recognition; Natural resources; Semantic segmentation; Semantics; Urban planning |
title | Enrich, Distill and Fuse: Generalized Few-Shot Semantic Segmentation in Remote Sensing Leveraging Foundation Model's Assistance |
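The Voting Fusion of Experts (VFE) step described in the abstract — integrating a foundation model's zero-shot prediction with a GFSS learner's few-shot prediction — can be sketched as a per-pixel weighted vote over class probabilities. This is an illustrative sketch only: the function name `voting_fusion` and the fixed blend weight `alpha` are assumptions for demonstration; the record does not specify how the paper actually weights or combines the two experts.

```python
import numpy as np

def voting_fusion(zero_shot_logits, few_shot_logits, alpha=0.5):
    """Fuse per-pixel predictions from two 'experts'.

    zero_shot_logits, few_shot_logits: arrays of shape (C, H, W),
    raw class scores from the foundation model (zero-shot) and the
    GFSS learner (few-shot), respectively.
    alpha: weight given to the zero-shot expert (hypothetical choice).
    Returns the fused per-pixel class map, shape (H, W).
    """
    def softmax(x):
        # normalize scores to probabilities along the class axis
        e = np.exp(x - x.max(axis=0, keepdims=True))
        return e / e.sum(axis=0, keepdims=True)

    fused = alpha * softmax(zero_shot_logits) + (1 - alpha) * softmax(few_shot_logits)
    return fused.argmax(axis=0)

# toy example: 3 classes on a 2x2 "image"
rng = np.random.default_rng(0)
pred = voting_fusion(rng.normal(size=(3, 2, 2)), rng.normal(size=(3, 2, 2)))
print(pred.shape)  # → (2, 2)
```

A convex combination of softmaxed scores is one common way to fuse two classifiers; other choices (per-class weights, hard majority voting) fit the same interface.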