HVT-cGAN: Hybrid Vision Transformer cGAN for SAR-to-Optical Image Translation
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024-12, pp. 1-1
Main Authors: , , ,
Format: Article
Language: English
Summary: Due to its capability for all-weather, all-time information acquisition, synthetic aperture radar (SAR) plays a vital role in the field of Earth observation. However, the specificity of the radar sensor and the complexity of electromagnetic scattering imaging physics mean that SAR images lack the intuitiveness of optical images, making them unsuitable for interpretation by non-experts. A common approach to this challenge is to use a conditional generative adversarial network (cGAN) to translate SAR images into optical images, thereby enhancing readability, assisting non-experts in interpretation, and filling gaps in optical data caused by acquisition constraints. Nevertheless, traditional cGAN-based methods are limited by inadequate extraction of global semantic information and poor detail preservation, leading to translated images with incoherent textures and colors and blurred edges. To address these issues, we propose a hybrid vision transformer conditional generative adversarial network (HVT-cGAN) for SAR-to-optical image translation (S2OIT). In the proposed HVT-cGAN, the generator uses a convolutional stem for patch embedding and encoding, and parallel CNN and vision transformer branches extract and map local and global information, respectively. Moreover, we propose a novel attention-based feature fusion module, the Convolutional Attention Fusion Module (CAFM), which adaptively aggregates local and global information from the parallel branches by learning both channel-wise and spatial-wise relations. Benefiting from these improvements, our method achieves superior performance in both qualitative and quantitative comparisons with other methods on the SEN1-2 dataset. Additionally, multiple ablation experiments validate the effectiveness of the proposed method.
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2024.3523040
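The abstract describes the CAFM only at a high level: it adaptively aggregates features from the CNN and vision transformer branches by learning channel-wise and spatial-wise relations. The PyTorch sketch below illustrates one plausible realization of such a fusion gate; the module name ConvAttentionFusion, the layer sizes, and the gating layout are illustrative assumptions, not the authors' published implementation.

# Minimal sketch of an attention-based fusion of a local (CNN) branch and a
# global (ViT) branch, in the spirit of the CAFM described in the abstract.
# All design details here are assumptions for illustration only.
import torch
import torch.nn as nn


class ConvAttentionFusion(nn.Module):
    """Fuse local and global feature maps with channel- and spatial-wise gates."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: pool spatial dims, then predict per-channel weights.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: a conv over the concatenated maps yields a per-pixel weight.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2 * channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        # local_feat, global_feat: (B, C, H, W) maps from the CNN and ViT branches.
        concat = torch.cat([local_feat, global_feat], dim=1)  # (B, 2C, H, W)
        w_channel = self.channel_gate(concat)                 # (B, C, 1, 1)
        w_spatial = self.spatial_gate(concat)                 # (B, 1, H, W)
        w = w_channel * w_spatial                             # broadcasts to (B, C, H, W)
        # Convex combination: w weights the local branch, (1 - w) the global branch.
        return w * local_feat + (1.0 - w) * global_feat


if __name__ == "__main__":
    fuse = ConvAttentionFusion(channels=64)
    local = torch.randn(2, 64, 32, 32)    # e.g. CNN-branch features
    glob = torch.randn(2, 64, 32, 32)     # e.g. ViT-branch features reshaped to a map
    print(fuse(local, glob).shape)        # torch.Size([2, 64, 32, 32])

In this sketch the two gates are learned jointly from the concatenated branches, so the fusion can emphasize local detail in some channels and regions while relying on global context elsewhere, which matches the adaptive aggregation behavior the abstract attributes to the CAFM.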