Decoupling and Integration Network for Camouflaged Object Detection

Bibliographic Details
Published in: IEEE Transactions on Multimedia 2024, Vol.26, p.7114-7129
Main Authors: Zhou, Xiaofei; Wu, Zhicong; Cong, Runmin
Format: Article
Language:English
Description
Summary: Camouflaged object detection (COD) faces numerous challenges, such as low contrast between camouflaged objects and their background and large variations in object appearance, and has recently received increasing attention. However, the performance of existing COD methods is still unsatisfactory, especially in complex scenes. In this article, we therefore propose a novel Decoupling and Integration Network (DINet) to detect camouflaged objects. The depiction of camouflaged objects is regarded as the iterative decoupling and integration of body features and detail features, where the former focus on the center of camouflaged objects and the latter cover pixels around edges. First, we deploy two complementary decoder branches, a detail branch and a body branch, to learn the decoupled features, namely body decoder features and detail decoder features. Each decoder block of the two branches incorporates features from three sources: the previous interactive feature fusion (IFF) module, the adjacent encoder layers, and the corresponding encoder layer. In addition, to further enhance the body decoder features, the body blocks introduce global contextual information, which combines all body encoder features via the global context (GC) unit, to provide coarse object location information. Second, to integrate the two decoupled decoder features, we deploy the IFF module, which is based on interactive combination and channel attention. In this way, we progressively build a complete and accurate representation of camouflaged objects. Extensive experiments on three challenging public datasets, CAMO, COD10K, and NC4K, show that DINet achieves competitive performance compared with state-of-the-art models.
ISSN: 1520-9210, 1941-0077
DOI:10.1109/TMM.2024.3360710
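
Illustrative note: the summary describes body features as covering the center of an object and detail features as covering pixels around its edges. The sketch below shows that split on a small binary mask in plain Python. It is not the DINet implementation: the network decouples learned decoder features, whereas this example only splits a ground-truth-style mask. The function names `decouple_mask` and `integrate` and the 4-neighbour edge rule are assumptions made for illustration.

```python
def decouple_mask(mask):
    """Split a binary mask (list of rows of 0/1) into (body, detail).

    detail: foreground pixels with at least one 4-neighbour that is
            background or outside the image (assumed edge rule).
    body:   the remaining interior foreground pixels.
    """
    h, w = len(mask), len(mask[0])
    body = [[0] * w for _ in range(h)]
    detail = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue  # background pixels belong to neither map
            on_edge = False
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not mask[ny][nx]:
                    on_edge = True
                    break
            if on_edge:
                detail[y][x] = 1
            else:
                body[y][x] = 1
    return body, detail


def integrate(body, detail):
    """Recombine the two maps with an element-wise maximum (union)."""
    return [[max(b, d) for b, d in zip(rb, rd)]
            for rb, rd in zip(body, detail)]


# A 3x3 object centred in a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
body, detail = decouple_mask(mask)
restored = integrate(body, detail)
```

For this mask, the body map contains only the central pixel, the detail map contains the eight surrounding object pixels, and integrating the two recovers the original mask. In the article, the integration step is instead carried out by the IFF module using interactive combination and channel attention.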