HTCViT: an effective network for image classification and segmentation based on natural disaster datasets

Bibliographic Details
Published in: The Visual Computer 2023-08, Vol. 39 (8), p. 3285-3297
Main Authors: Ma, Zhihao, Li, Wei, Zhang, Muyang, Meng, Weiliang, Xu, Shibiao, Zhang, Xiaopeng
Format: Article
Language: English
Description
Summary: Classifying and segmenting natural disaster images is crucial for predicting and responding to disasters. However, current convolutional networks perform poorly on natural disaster images, and few networks are designed specifically for this task. To address the varying scales of the region of interest (ROI) in these images, we propose the Hierarchical TSAM-CB-ViT (HTCViT) network, which builds on the ViT network’s attention mechanism to better process natural disaster images. Because ViT excels at extracting global context but struggles with local features, our method combines the strengths of ViT and convolution, and captures the overall contextual information within each patch using the Triple-Strip Attention Mechanism (TSAM) structure. Experiments show that, compared to the vanilla ViT network, HTCViT improves classification by 3-4% and segmentation by 1-2% on natural disaster datasets.
ISSN: 0178-2789, 1432-2315
DOI: 10.1007/s00371-023-02954-3