Loading…

An Attention Cascade Global–Local Network for Remote Sensing Scene Classification

Remote sensing image scene classification is an important task of remote sensing image interpretation, which has recently been well addressed by the convolutional neural network owing to its powerful learning ability. However, due to the multiple types of geographical information and redundant backg...

Full description

Saved in:

Bibliographic Details
Published in:	Remote sensing (Basel, Switzerland) Switzerland), 2022-05, Vol.14 (9), p.2042
Main Authors:	Shen, Junge, Yu, Tianwei, Yang, Haopeng, Wang, Ruxin, Wang, Qi
Format:	Article
Language:	English
Subjects:	Artificial neural networks Classification convolutional neural network Deep learning Feature extraction feature fusion Image classification neural architecture search Neural networks Popularity Remote sensing remote sensing scene classification Semantics
Citations:	Items that this one cites Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Remote sensing image scene classification is an important task of remote sensing image interpretation, which has recently been well addressed by the convolutional neural network owing to its powerful learning ability. However, due to the multiple types of geographical information and redundant background information of the remote sensing images, most of the CNN-based methods, especially those based on a single CNN model and those ignoring the combination of global and local features, exhibit limited performance on accurate classification. To compensate for such insufficiency, we propose a new dual-model deep feature fusion method based on an attention cascade global–local network (ACGLNet). Specifically, we use two popular CNNs as the feature extractors to extract complementary multiscale features from the input image. Considering the characteristics of the global and local features, the proposed ACGLNet filters the redundant background information from the low-level features through the spatial attention mechanism, followed by which the locally attended features are fused with the high-level features. Then, bilinear fusion is employed to produce the fused representation of the dual model, which is finally fed to the classifier. Through extensive experiments on four public remote sensing scene datasets, including UCM, AID, PatternNet, and OPTIMAL-31, we demonstrate the feasibility of the proposed method and its superiority over the state-of-the-art scene classification methods.
ISSN:	2072-4292 2072-4292
DOI:	10.3390/rs14092042