
HEFANet: hierarchical efficient fusion and aggregation segmentation network for enhanced rgb-thermal urban scene parsing

Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2024-11, Vol.54 (22), p.11248-11266
Main Authors: Shen, Zhengwen; Pan, Zaiyu; Weng, Yuchen; Li, Yulian; Wang, Jiangyu; Wang, Jun
Format: Article
Language: English
Description
Summary: RGB-Thermal semantic segmentation is important for applications that operate under adverse illumination conditions, such as autonomous driving and robotic sensing. However, most existing methods ignore the feature differences between the two modalities and do not effectively exploit features at different levels. In this paper, we present a novel multimodal feature fusion network named HEFANet, which enhances the interaction and fusion of features. Concretely, we propose a Cross-layer and Cross-modal Feature Descriptor module (CCFD) to mitigate the differences between modalities and to mine valuable, correlated features across layers. To fuse multimodal features effectively at different levels, we propose a Multi-modal Interleaved Sparse Self-Attention module (MISSA) to aggregate rich spatial semantic information in the earlier layers. In the last layer, we propose the Spatial Interaction and Channel Selection module (SICS), which enhances the representation of rich contextual features and highlights important information through channel communication interactions, selectively aggregating sparse features. Extensive experiments on three publicly available datasets (MFNet, PST900, and FMB) achieve new state-of-the-art results. The code and results are available at https://github.com/shenzw21/HEFANet.
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-024-05743-0