Robust Optical and SAR Image Matching Using Attention-Enhanced Structural Features

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024-01, Vol. 62, p. 1-1
Main Authors: Ye, Yuanxin; Yang, Chao; Gong, Guoqing; Yang, Peizhen; Quan, Dou; Li, Jiayuan
Format: Article
Language: English
Description
Summary: Optical and SAR images carry complementary information, so their alignment is of increasing interest. However, the significant radiometric differences between them make precise matching very challenging. Although current structural-feature and deep learning-based methods offer feasible solutions, there is still considerable room for improvement. This paper proposes a hybrid matching method using attention-enhanced structural features (AESF), which combines the advantages of handcrafted and learning-based methods to improve the accuracy of optical and SAR image matching. It consists mainly of two modules: a multi-branch global attention (MBGA) module and a joint multi-cropping image matching loss function (MCTM) module. The MBGA module focuses on information shared by the structural feature descriptors of heterogeneous images across the spatial and channel dimensions, which significantly improves the expressive capacity of classical structural features and yields more refined and robust image features. The MCTM module exploits the association between global and local information in the input image, which helps optimize the triplet-loss discriminator that separates positive and negative samples. The proposed method is compared with five state-of-the-art matching methods on various optical and SAR datasets. The experimental results show that matching accuracy at the 1-pixel threshold improves by about 1.8%-8.7% over the most advanced deep learning method (OSMNet) and by 6.5%-23% over the handcrafted descriptor method (CFOG).
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2024.3366247
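
Note on the reported metric: the summary quotes matching accuracy at a 1-pixel threshold without defining it. A minimal sketch of one common reading follows, assuming that a match counts as correct when its predicted location lies within the threshold (Euclidean distance) of the reference location. The function name, argument layout, and sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def matching_accuracy(predicted, reference, threshold=1.0):
    """Fraction of predicted points within `threshold` pixels of their reference.

    Assumption: `predicted` and `reference` are equal-length sequences of
    (x, y) pixel coordinates, paired by index.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        return 0.0
    correct = sum(
        1
        for (px, py), (rx, ry) in zip(predicted, reference)
        if math.hypot(px - rx, py - ry) <= threshold
    )
    return correct / len(predicted)


# Made-up coordinates for illustration only.
predicted = [(10.2, 20.1), (31.5, 40.0), (55.0, 60.5), (70.0, 80.9)]
reference = [(10.0, 20.0), (30.0, 40.0), (55.0, 60.0), (70.0, 80.0)]
accuracy = matching_accuracy(predicted, reference)
```

Under this reading, the second pair (1.5 pixels off) is the only one that misses the threshold, so three of the four sample matches count as correct.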