HTCViT: an effective network for image classification and segmentation based on natural disaster datasets
Published in: | The Visual computer 2023-08, Vol.39 (8), p.3285-3297 |
---|---|
Format: | Article |
Language: | English |
Summary: | Classifying and segmenting natural disaster images are crucial for predicting and responding to disasters. However, current convolutional networks perform poorly in processing natural disaster images, and there are few proprietary networks for this task. To address the varying scales of the region of interest (ROI) in these images, we propose the Hierarchical TSAM-CB-ViT (HTCViT) network, which builds on the ViT network’s attention mechanism to better process natural disaster images. Considering that ViT excels at extracting global context but struggles with local features, our method combines the strengths of ViT and convolution, and can capture overall contextual information within each patch using the Triple-Strip Attention Mechanism (TSAM) structure. Experiments validate that our HTCViT can improve the classification task with 3-4% and the segmentation task with 1-2% on natural disaster datasets compared to the vanilla ViT network. |
---|---|
ISSN: | 0178-2789 1432-2315 |
DOI: | 10.1007/s00371-023-02954-3 |