Advancing Multi-Class Semantic Segmentation of High-Resolution Satellite Imagery through Enhanced ASPP and Attention Mechanisms

Bibliographic Details
Main Authors: Srivastava, Noopur; Rai, Abhishek; Prasad Kushwaha, Sunni Kanta; Jain, Kamal
Format: Conference Proceeding
Language: English
Description
Summary: Semantic segmentation is crucial for accurately understanding and monitoring Land-Use-Land-Cover (LULC) changes. To address challenges such as ground object complexity and edge loss, we propose the MC-SegNext model. This model integrates a Convolutional Block Attention Module (CBAM), a multi-head self-attention mechanism, and a proposed Sequential-Atrous Spatial Pyramid Pooling (S-ASPP) module into the existing DeepLabv3+ encoder-decoder framework. Unlike the parallel atrous convolutions in the original DeepLabv3+, the S-ASPP module combines the outputs of the atrous convolutions sequentially, so that the contribution of each one is better retained. By combining the S-ASPP technique with attention mechanisms, the model can effectively gather spatial information and enhance multi-scale features. Our model was tested on open datasets, including ISPRS Potsdam, ISPRS Vaihingen, and LoveDA Urban and Rural, and compared with state-of-the-art models using metrics such as IoU, F1-score, and accuracy. Compared to DeepLabv3+, MC-SegNext improved mean Intersection over Union (mIoU) by 4.51% on ISPRS Potsdam, 4.08% on ISPRS Vaihingen, 4.41% on LoveDA Urban, and 4.68% on LoveDA Rural.
ISSN: 2153-7003
DOI: 10.1109/IGARSS53475.2024.10640382
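
The summary reports results in terms of IoU, F1-score, and accuracy. As a reading aid, the following is a minimal sketch of how these metrics are commonly derived from a pixel-level confusion matrix. It is not the authors' evaluation code: the function names are illustrative, and details such as whether any class is excluded from the mean are not stated in this record and are left out here.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels for each (true class, predicted class) pair."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each pair as a single index, then histogram the indices.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def segmentation_metrics(cm):
    """Return per-class IoU, per-class F1, mIoU, and overall accuracy."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp  # predicted as the class, labelled otherwise
    fn = cm.sum(axis=1) - tp  # labelled as the class, predicted otherwise
    iou_den = tp + fp + fn
    f1_den = 2 * tp + fp + fn
    # A class absent from both labels and predictions yields NaN
    # and is skipped by nanmean (an assumption of this sketch).
    iou = np.where(iou_den > 0, tp / np.maximum(iou_den, 1), np.nan)
    f1 = np.where(f1_den > 0, 2 * tp / np.maximum(f1_den, 1), np.nan)
    return {
        "iou": iou,
        "f1": f1,
        "miou": float(np.nanmean(iou)),
        "accuracy": float(tp.sum() / cm.sum()),
    }

# Toy 2x3 label map with three classes (made-up values for illustration).
labels = np.array([[0, 0, 1], [1, 2, 2]])
preds = np.array([[0, 1, 1], [1, 2, 0]])
metrics = segmentation_metrics(confusion_matrix(labels, preds, num_classes=3))
print(metrics["miou"], metrics["accuracy"])
```

In this toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5, while four of the six pixels are labelled correctly.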