Transformer-based difference fusion network for RGB-D salient object detection
Published in: Journal of Electronic Imaging, 2022-11, Vol. 31 (6), p. 063058
Main Authors: , ,
Format: Article
Language: English
Summary: RGB-D salient object detection (SOD) can usually be divided into three stages: feature extraction, feature fusion, and feature prediction. Most approaches treat the feature information extracted by the backbone network identically in the final two stages, neglecting the fact that different modalities and different hierarchical features play distinct roles in SOD, which leads to poor detection results. To solve this problem, we propose a transformer-based difference fusion network (TDF-Net) for RGB-D SOD that treats modal features and hierarchical features differently in the feature fusion and feature prediction stages, respectively. First, we adopt the pyramid vision transformer as a feature extractor to obtain hierarchical features from the input RGB and depth images. Second, we propose a differential interactive fusion module in which the RGB and depth modalities learn modality-specific features independently and guide each other during feature fusion. Finally, we divide the hierarchical features after cross-modal fusion into high-level and low-level features and propose three types of cross-layer fusion modules that discriminatively integrate features from different layers to predict the saliency maps. Extensive experiments on five benchmark datasets confirm that the proposed TDF-Net outperforms state-of-the-art methods.
ISSN: 1017-9909, 1560-229X
DOI: 10.1117/1.JEI.31.6.063058