
DDRNet: Dual-Domain Refinement Network for Remote Sensing Image Semantic Segmentation

Bibliographic Details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, Vol. 17, pp. 20177-20189
Main Authors: Yang, Zhenhao; Bi, Fukun; Hou, Xinghai; Zhou, Dehao; Wang, Yanping
Format: Article
Language: English
Description
Summary: Semantic segmentation is crucial for interpreting remote sensing images, and segmentation performance has improved significantly in recent years with the development of deep learning. However, complex background samples and small objects greatly increase the difficulty of semantic segmentation for remote sensing images. To address these challenges, we propose a dual-domain refinement network (DDRNet) for accurate segmentation. Specifically, we first propose a spatial and frequency feature reconstruction module, which separately exploits the characteristics of the frequency and spatial domains to refine the globally salient features and the fine-grained spatial features of objects; this enhances foreground saliency and adaptively suppresses background noise. Subsequently, we propose a feature alignment module that selectively couples the features refined in the two domains via cross-attention, achieving semantic alignment between the frequency and spatial domains. In addition, a detail-aware attention module is introduced to compensate for the loss of small objects during feature propagation; it leverages cross-correlation matrices between high-level features and the original image to quantify the similarity among objects of the same category, thereby transmitting rich semantic information from high-level features to small objects. Results on multiple datasets validate that our method outperforms existing methods and achieves a good compromise between computational overhead and accuracy.
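
The record does not include source code. As a rough illustration of the ideas described in the summary, the following is a minimal PyTorch-style sketch of a dual-domain refinement block: a frequency branch that re-weights the FFT spectrum, a spatial gating branch, and a cross-attention step that couples the two. The module name DualDomainRefinement, the layer choices, and the use of torch.fft are assumptions made for illustration only and do not reproduce the authors' implementation.

```python
import torch
import torch.nn as nn


class DualDomainRefinement(nn.Module):
    """Hypothetical sketch: refine features in the frequency and spatial domains,
    then couple the two branches with cross-attention (not the authors' code)."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        # Frequency branch: learned per-channel re-weighting of the spectrum.
        self.freq_weight = nn.Parameter(torch.ones(channels, 1, 1))
        # Spatial branch: lightweight gate that emphasises foreground responses.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # Cross-attention lets spatially refined tokens attend to
        # frequency-refined tokens, aligning the two domains.
        self.cross_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Frequency-domain refinement: emphasise globally salient structure.
        spec = torch.fft.rfft2(x, norm="ortho")
        spec = spec * self.freq_weight
        x_freq = torch.fft.irfft2(spec, s=(h, w), norm="ortho")
        # Spatial-domain refinement: keep fine-grained local detail.
        x_spat = x * self.spatial_gate(x)
        # Cross-attention coupling: spatial tokens query frequency tokens.
        q = self.norm(x_spat.flatten(2).transpose(1, 2))   # (B, H*W, C)
        kv = self.norm(x_freq.flatten(2).transpose(1, 2))  # (B, H*W, C)
        fused, _ = self.cross_attn(q, kv, kv)
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        return x + fused  # residual connection keeps the original features


if __name__ == "__main__":
    block = DualDomainRefinement(channels=64, heads=4)
    feats = torch.randn(2, 64, 32, 32)
    print(block(feats).shape)  # torch.Size([2, 64, 32, 32])
```

In this sketch, DualDomainRefinement(64) applied to a (2, 64, 32, 32) feature map returns a tensor of the same shape; the channel count must be divisible by the number of attention heads.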
ISSN: 1939-1404
EISSN: 2151-1535
DOI: 10.1109/JSTARS.2024.3490584