A lightweight network for abdominal multi-organ segmentation based on multi-scale context fusion and dual self-attention

Bibliographic Details
Published in:	Information fusion 2024-08, Vol.108, p.102401, Article 102401
Main Authors:	Liao, Miao; Tang, Hongliang; Li, Xiong; Vijayakumar, P.; Arya, Varsha; Gupta, Brij B.
Format:	Article
Language:	English
Subjects:	Context-aware; CT image; Feature fusion; Segmentation; Self-attention

Description
Summary:	Segmenting the organs from abdominal CT images is a vital procedure for computer-aided diagnosis and treatment. Accurate and simultaneous segmentation of multiple abdominal organs remains challenging due to their complex structures, varying sizes, and fuzzy boundaries. Currently, most methods that aim to improve segmentation accuracy either deepen the network or employ large-scale models, which results in a heavy computational burden and a huge number of model parameters. Such methods are difficult to deploy in a medical environment. In this paper, we present a lightweight network based on multi-scale context fusion and dual self-attention. The dual self-attention mechanism is used to obtain target organ responses in the channel domain, while also strengthening the correlation of global information in the spatial domain. Considering the complex structure of abdominal organs, we design a multi-scale context fusion module comprising pyramid pooling (PP) and anisotropic strip pooling (ASP). The PP is used to acquire rich local features by aggregating context information from different receptive fields, while the ASP is designed to extract strip features in different directions to help the network establish long-distance dependencies and capture the characteristics of elongated organs, such as the pancreas and spleen. Moreover, a residual module is introduced in the skip connection to learn features related to edges and small objects. The proposed method achieves average Dice scores of 90.1% and 82.5% on the FLARE and BTCV datasets, respectively, with only 6.25M model parameters and 21.40G FLOPs, outperforming many state-of-the-art methods.
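The summary reports accuracy as an average Dice score. As a rough illustration of what that metric measures, the sketch below computes a per-organ Dice coefficient and its mean over labels using the common definition 2|A∩B| / (|A| + |B|). The function names `dice_score` and `mean_dice`, the flat label lists, and the empty-organ convention are assumptions made for this example; the record does not state how the authors averaged over organs or cases.

```python
def dice_score(pred, truth, label):
    """Dice coefficient for one organ label over two equal-length label lists."""
    p = [v == label for v in pred]
    t = [v == label for v in truth]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)
    if total == 0:
        # Assumed convention: an organ absent from both masks counts as a match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pred, truth, labels):
    """Unweighted mean of the per-label Dice coefficients."""
    return sum(dice_score(pred, truth, lab) for lab in labels) / len(labels)


# Toy flattened label maps: 0 = background, 1 and 2 = two hypothetical organs.
pred = [0, 1, 1, 2, 2, 2]
truth = [0, 1, 2, 2, 2, 2]
print(dice_score(pred, truth, 1))    # 2*1 / (2+1)
print(dice_score(pred, truth, 2))    # 2*3 / (3+4)
print(mean_dice(pred, truth, [1, 2]))
```

Background is excluded from the mean here, which is a frequent choice in multi-organ benchmarks but is likewise an assumption.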
• A multi-scale context fusion module is proposed to extract local and global features.
• Anisotropic strip pooling is designed to improve the accuracy of strip-shaped organs.
• A dual self-attention module is proposed to establish global information connections.
• A residual module is employed to compensate for information loss in down-sampling.
• Our model shows obvious advantages in computational cost and segmentation accuracy.
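To make the contrast between grid-based pyramid pooling and strip pooling concrete, here is a minimal sketch in plain Python on a single 2D feature map. It shows only generic average pooling over square bins and over full-length horizontal and vertical strips; it is not the authors' PP or ASP module, whose anisotropic design is not described in this record. The names `grid_pool`, `strip_pool`, and `strip_context` are hypothetical.

```python
def grid_pool(feature, bins):
    """Average-pool a 2D map into bins x bins cells (assumes H and W divide evenly)."""
    h, w = len(feature), len(feature[0])
    bh, bw = h // bins, w // bins
    return [
        [
            sum(feature[i][j]
                for i in range(r * bh, (r + 1) * bh)
                for j in range(c * bw, (c + 1) * bw)) / (bh * bw)
            for c in range(bins)
        ]
        for r in range(bins)
    ]


def strip_pool(feature, axis):
    """Average along full-length strips: one value per row or per column."""
    h, w = len(feature), len(feature[0])
    if axis == "horizontal":
        return [sum(row) / w for row in feature]
    if axis == "vertical":
        return [sum(feature[i][j] for i in range(h)) / h for j in range(w)]
    raise ValueError("axis must be 'horizontal' or 'vertical'")


def strip_context(feature):
    """Broadcast row and column strip averages back to H x W and add them."""
    rows = strip_pool(feature, "horizontal")
    cols = strip_pool(feature, "vertical")
    return [[rows[i] + cols[j] for j in range(len(cols))] for i in range(len(rows))]


# A 4x4 map containing one elongated, horizontal structure in row 1.
feature = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(grid_pool(feature, 2))              # square bins dilute the thin structure
print(strip_pool(feature, "horizontal"))  # the row strip keeps it at full strength
print(strip_context(feature)[1])          # context along the structure's row
```

In this toy case the 2x2 grid averages the thin row down to 0.5, while the horizontal strip average for that row stays at 1.0, which is the intuition behind using strips for elongated organs.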
ISSN:	1566-2535, 1872-6305
DOI:	10.1016/j.inffus.2024.102401