
MoCG: Modality Characteristics-Guided Semantic Segmentation in Multimodal Remote Sensing Images

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2023, Vol. 61, pp. 1-18
Main Authors: Xiao, Sining, Wang, Peijin, Diao, Wenhui, Rong, Xuee, Li, Xuexue, Fu, Kun, Sun, Xian
Format: Article
Language: English
Description
Summary: The rapid development of satellite platforms has yielded copious and diverse multisource data for earth observation, greatly facilitating the growth of multimodal semantic segmentation (MSS) in remote sensing. However, MSS also faces numerous challenges: 1) inherent defects in each modality due to the different imaging mechanisms; 2) insufficient exploration of the intrinsic characteristics of each modality; and 3) a huge semantic gap between heterogeneous data, which makes feature fusion difficult. Failing to effectively utilize the rich and diverse information provided by each modality, and ignoring the heterogeneity between modalities, hinders feature enhancement and further significantly degrades semantic segmentation accuracy; neglecting the semantic gap likewise makes feature fusion challenging. In this study, we introduce a novel framework for MSS that effectively mitigates the aforementioned problems. Our approach employs a pseudo-Siamese structure for feature extraction. Specifically, we propose a simple yet effective geometric topology structure modeling (GTSM) module to extract geometric relationships and texture information from optical data. Additionally, we present a modality intrinsic noise suppression (MINS) module to fully exploit radiation information and alleviate the effects of the unique geometric distortions of synthetic aperture radar (SAR). Furthermore, we present an adaptive multimodal feature fusion (AMFF) module for fully fusing the different modality features. Extensive experiments on both the WHU-OPT-SAR and DFC23 datasets validate the robustness and effectiveness of the proposed Modality Characteristics-Guided (MoCG) semantic segmentation network compared with other state-of-the-art semantic segmentation methods, including multimodal and single-modal approaches.
Our approach achieves the best performance on both datasets, with mean intersection over union (mIoU)/overall accuracy (OA) of 69.1%/87.5% on WHU-OPT-SAR and 86.7%/97.3% on DFC23.
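This record names the modules (GTSM, MINS, AMFF) but does not describe their internals. As a minimal, hypothetical sketch of the general idea behind adaptive multimodal fusion — per-pixel weights derived from each modality's features, followed by a weighted sum — one might write the following; the function name, the confidence score, and the softmax weighting scheme are assumptions for illustration, not the paper's AMFF:

```python
import numpy as np

def adaptive_fusion(opt_feat, sar_feat):
    """Illustrative adaptive fusion of two modality feature maps (C, H, W).

    Each modality gets a per-pixel confidence score (here simply the
    channel-mean activation); a softmax over the two scores yields
    spatial weights, and the fused map is the weighted sum. This is a
    generic sketch, not the AMFF module from the paper.
    """
    s_opt = opt_feat.mean(axis=0, keepdims=True)   # (1, H, W) score map
    s_sar = sar_feat.mean(axis=0, keepdims=True)
    scores = np.stack([s_opt, s_sar], axis=0)      # (2, 1, H, W)
    weights = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    return weights[0] * opt_feat + weights[1] * sar_feat

rng = np.random.default_rng(0)
opt_feat = rng.standard_normal((16, 8, 8))  # optical branch features
sar_feat = rng.standard_normal((16, 8, 8))  # SAR branch features
fused = adaptive_fusion(opt_feat, sar_feat)
print(fused.shape)  # (16, 8, 8)
```

Because the weights sum to one at every pixel, each fused value is a convex combination of the two modality values there, so neither branch can be silently discarded.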
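For reference, the two metrics reported above have standard definitions: mIoU averages the per-class intersection-over-union computed from a confusion matrix, and OA is the fraction of correctly labeled pixels. A small self-contained computation of both (textbook definitions, not code from the paper):

```python
import numpy as np

def miou_and_oa(pred, gt, num_classes):
    """Compute mean IoU and overall accuracy from integer label maps.

    Assumes every class index appears in pred or gt; a class absent
    from both would give a 0/0 IoU and should be masked in practice.
    """
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(pred.ravel(), gt.ravel()):
        cm[g, p] += 1                      # rows: ground truth, cols: prediction
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    iou = tp / union                       # per-class IoU
    oa = tp.sum() / cm.sum()               # overall accuracy
    return iou.mean(), oa

pred = np.array([[0, 1], [1, 2]])
gt = np.array([[0, 1], [2, 2]])
miou, oa = miou_and_oa(pred, gt, 3)
print(round(miou, 3), round(oa, 2))  # 0.667 0.75
```

Here class 0 is perfectly segmented (IoU 1.0) while classes 1 and 2 each have IoU 0.5, giving mIoU 2/3, and 3 of 4 pixels are correct, giving OA 0.75.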
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2023.3334471