
HTCViT: an effective network for image classification and segmentation based on natural disaster datasets

Classifying and segmenting natural disaster images are crucial for predicting and responding to disasters. However, current convolutional networks perform poorly on natural disaster images, and few networks are designed specifically for this task. To address the varying scales of the region of interest (ROI) in these images, we propose the Hierarchical TSAM-CB-ViT (HTCViT) network, which builds on the ViT network's attention mechanism to better process natural disaster images. Because ViT excels at extracting global context but struggles with local features, our method combines the strengths of ViT and convolution, and captures the overall contextual information within each patch using the Triple-Strip Attention Mechanism (TSAM) structure. Experiments validate that HTCViT improves the classification task by 3-4% and the segmentation task by 1-2% on natural disaster datasets compared to the vanilla ViT network.
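The record does not include the paper's implementation, so the exact TSAM design is not available here. As a rough, illustrative sketch of the general idea named in the abstract (convolutional features re-weighted by pooled "strip" attention), the following PyTorch snippet is a minimal example; the module name, the choice of three strips (horizontal, vertical, and a global pooled proxy), and the sigmoid gating are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative strip-attention sketch in the spirit of the TSAM block
# described in the abstract. NOT the paper's actual TSAM: the three
# "strips" (horizontal, vertical, global) and the gating scheme are
# assumptions chosen only to show how strip pooling can modulate a
# convolutional feature map.
import torch
import torch.nn as nn


class StripAttentionSketch(nn.Module):
    """Re-weights a convolutional feature map with three pooled strips."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # One 1x1 projection per strip; sigmoid turns each into a gate.
        self.h_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.w_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.g_proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width) patch features.
        feat = self.conv(x)
        # Horizontal strip: average over width -> (B, C, H, 1).
        h_strip = torch.sigmoid(self.h_proj(feat.mean(dim=3, keepdim=True)))
        # Vertical strip: average over height -> (B, C, 1, W).
        w_strip = torch.sigmoid(self.w_proj(feat.mean(dim=2, keepdim=True)))
        # Global strip: average over both spatial dims -> (B, C, 1, 1).
        g_strip = torch.sigmoid(self.g_proj(feat.mean(dim=(2, 3), keepdim=True)))
        # Broadcast the three gates back over the feature map.
        return feat * h_strip * w_strip * g_strip


if __name__ == "__main__":
    block = StripAttentionSketch(channels=64)
    out = block(torch.randn(2, 64, 14, 14))
    print(out.shape)  # torch.Size([2, 64, 14, 14])
```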

Bibliographic Details
Published in: The Visual Computer, 2023-08, Vol. 39 (8), pp. 3285-3297
Main Authors: Ma, Zhihao; Li, Wei; Zhang, Muyang; Meng, Weiliang; Xu, Shibiao; Zhang, Xiaopeng
Format: Article
Language: English
Publisher: Berlin/Heidelberg: Springer Berlin Heidelberg
Source: Springer Link
DOI: 10.1007/s00371-023-02954-3
ISSN: 0178-2789
EISSN: 1432-2315
Subjects: Artificial Intelligence; Classification; Computer Graphics; Computer Science; Datasets; Disasters; Image classification; Image Processing and Computer Vision; Image segmentation; Natural disasters; Neural networks; Original Article