
A Siamese Network Based on Multiple Attention and Multilayer Transformers for Change Detection


Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2023, Vol. 61, pp. 1-15
Main Authors: Tang, Wenjie; Wu, Ke; Zhang, Yuxiang; Zhan, Yanting
Format: Article
Language: English
Description
Summary: Deep learning (DL) networks have demonstrated promising performance in high-resolution remote sensing (RS) image change detection (CD). The transformer can enhance features and capture global semantic relations, and it has been applied to the CD problem for high-resolution RS images with good results. However, the depth of the transformer is limited and the extracted features are not sufficiently representative, which leaves the performance of the CD model unsatisfactory. To address this problem, we propose a Siamese network based on multiple attention and multilayer transformers (SMART) for CD in this article. It is a Siamese network containing three different modules, which processes bitemporal images in parallel and extracts enhanced features at different levels. The first is the feature extraction module. It expresses the features as a certain number of high-order semantic features through the spatial attention module (SPAM) and then computes the semantic relations between these high-order semantic features with the transformer encoder, which greatly improves computational efficiency. The second is the feature enhancement module. It computes global semantic relations with a self-attention module (SFAM): the multilayer encoder obtains enhanced features at different levels by computing the relationships between features at each layer, and the multilayer decoder refines the bitemporal features of each layer and projects them back to the original space. The third is the fusion module. It uses the ensemble channel attention module (ECAM) to exploit the feature differences at different levels. The proposed SMART model is compared with several state-of-the-art CD methods on three publicly available datasets, and the results confirm that SMART outperforms them on several evaluation metrics. Our code is available at https://github.com/TwJ-IGG/SMART
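As a rough illustration of the Siamese-with-attention pattern the summary describes, the sketch below uses NumPy to run a shared (weight-tied) projection over two temporal images, gate the features with a simple spatial attention, mix them with single-head self-attention, and fuse the absolute difference through a channel attention. All function names, the linear stand-in for the CNN backbone, and the pooling-based attention gates are hypothetical simplifications for illustration only; SMART's actual SPAM, SFAM, and ECAM layers are defined in the linked repository.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat):
    # feat: (C, H, W). Pool across channels (mean and max), then gate each
    # spatial location; a crude stand-in for a learned spatial-attention conv.
    avg = feat.mean(axis=0, keepdims=True)   # (1, H, W)
    mx = feat.max(axis=0, keepdims=True)     # (1, H, W)
    return feat * sigmoid(avg + mx)

def self_attention(tokens):
    # tokens: (N, D). Single-head scaled dot-product attention with
    # query = key = value = tokens (no learned projections in this sketch).
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)        # row-wise softmax
    return w @ tokens

def channel_attention(feat):
    # feat: (C, H, W). Global average pool per channel, then gate channels.
    gate = sigmoid(feat.mean(axis=(1, 2)))[:, None, None]  # (C, 1, 1)
    return feat * gate

def siamese_change_map(img_a, img_b, weights):
    # Siamese property: both temporal images pass through the *same* weights.
    # weights: (C_out, C_in); images: (C_in, H, W).
    feat_a = np.einsum('oc,chw->ohw', weights, img_a)
    feat_b = np.einsum('oc,chw->ohw', weights, img_b)
    outs = []
    for feat in (feat_a, feat_b):
        feat = spatial_attention(feat)                     # spatial gating
        c, h, w = feat.shape
        tokens = self_attention(feat.reshape(c, -1).T)     # (H*W, C) tokens
        outs.append(tokens.T.reshape(c, h, w))             # back to (C, H, W)
    diff = channel_attention(np.abs(outs[0] - outs[1]))    # fuse by |difference|
    return diff.mean(axis=0)                               # (H, W) change scores
```

Because the two branches share weights and follow the identical deterministic path, feeding the same image twice yields an all-zero change map, which is a quick sanity check for any Siamese CD pipeline.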
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2023.3325220