H2MaT-Unet: Hierarchical hybrid multi-axis transformer based Unet for medical image segmentation

Bibliographic Details
Published in:	Computers in biology and medicine 2024-05, Vol.174, p.108387, Article 108387
Main Authors:	Ju, ZhiYong; Zhou, ZhongChen; Qi, ZiXiang; Yi, Cheng
Format:	Article
Language:	English
Subjects:	Algorithms; Artificial neural networks; Deep Learning; Hierarchical hybrid multiaxial attention mechanism; Humans; Image enhancement; Image processing; Image Processing, Computer-Assisted - methods; Image reconstruction; Image resolution; Image segmentation; Localization; Machine learning; Medical image segmentation; Medical imaging; Medical treatment; Multiaxis; Neural networks; Neural Networks, Computer; Self-attention; Spatial and channel reconstruction convolution; Target recognition; Transformers

Description
Summary:	Accurate segmentation and lesion localization in medical images are essential for treating diseases. Although deep learning methods have improved segmentation, they remain limited by the inability of convolutional neural networks to capture long-range feature dependencies. The self-attention mechanism in Transformers addresses this drawback, but its computational cost becomes a problem on high-resolution images. To improve on both convolution and the Transformer, we propose a hierarchical hybrid multiaxial attention mechanism called H2MaT-Unet. This approach combines hierarchical post-feature data and applies the multiaxial attention mechanism to the feature interactions, which facilitates efficient local and global interactions. Furthermore, we introduce a Spatial and Channel Reconstruction Convolution (ScConv) module to enhance feature aggregation. H2MaT-UNet achieves 87.74% Dice on the multi-target segmentation task and 87.88% IoU on the single-target segmentation task, surpassing current popular models and setting a new state of the art (SOTA). H2MaT-UNet synthesizes multi-scale feature information during the layering stage and uses a multi-axis attention mechanism to strengthen global information interactions. This research holds value for the practical application of deep learning in clinical settings: it allows healthcare providers to analyze segmented details of medical images more quickly and accurately.
Highlights:
• Adv. Feature Rep.: H2MaT-UNet combines attention with a U-shaped structure for superior medical image segmentation.
• MbConv & MSCA: The MSCA branch, with channel attention and MbConv, excels in diverse medical image segmentation challenges.
• Scalable ScConv: The ScConv module enhances model generalization, reduces redundancy, and improves image segmentation accuracy.
• SOTA Perf.: H2MaT-UNet excels in skin lesion segmentation and Synapse Multi-Organ Segmentation, confirming its competitiveness.
ISSN:	0010-4825; 1879-0534
DOI:	10.1016/j.compbiomed.2024.108387
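
The summary reports results as Dice and IoU scores. For readers unfamiliar with these metrics, the following is a minimal sketch of how they are commonly computed for a pair of binary masks. It is not taken from the paper: the function names, the flat-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, and this record does not state how the authors average scores across classes or images.

```python
def _overlap_counts(pred, target):
    """Return (intersection, predicted pixels, target pixels) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    n_pred = sum(1 for p in pred if p)
    n_target = sum(1 for t in target if t)
    return intersection, n_pred, n_target


def dice_score(pred, target):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|)."""
    intersection, n_pred, n_target = _overlap_counts(pred, target)
    denominator = n_pred + n_target
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator


def iou_score(pred, target):
    """Intersection over union: |A and B| / |A or B|."""
    intersection, n_pred, n_target = _overlap_counts(pred, target)
    union = n_pred + n_target - intersection
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 6-pixel masks (1 = foreground, 0 = background).
example_pred = [1, 1, 0, 0, 1, 0]
example_target = [1, 0, 0, 0, 1, 1]
print(dice_score(example_pred, example_target))
print(iou_score(example_pred, example_target))
```

For a single pair of masks the two metrics are related by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU on the same prediction. The 87.74% Dice and 87.88% IoU quoted in the summary come from different tasks and are therefore not directly comparable.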