Clustering-guided Class Activation for Weakly Supervised Semantic Segmentation


Bibliographic Details
Published in: IEEE Access 2024-01, Vol. 12, p. 1-1
Main Authors: Kim, Yeong Woo; Kim, Wonjun
Format: Article
Language: English
Description
Summary: Transformer-based weakly-supervised semantic segmentation (WSSS) methods have been actively studied because of their strong capability to capture global context. However, since the activation function highlights only a few tokens in the transformer's self-attention mechanism, these methods still suffer from sparse attention maps, which lead to incomplete pseudo labels. In this paper, we propose a novel class activation scheme that uniformly highlights the whole object region. The key idea of the proposed method is to activate the object region under the guidance of clusters, which are formed by grouping similar image features of the object. Specifically, the clustering-guided class activation map (ClusterCAM) is generated by the proposed clustering-based attention module, and highly responsive regions in this map are then used to activate target objects in the encoded feature space. This helps the model explore the entire region of the target object by exploiting the semantic proximity between patch tokens extracted from the same object. Based on this, we design an end-to-end WSSS framework that trains the classification and segmentation networks simultaneously in a single stage. Experimental results on benchmark datasets show that the proposed method significantly outperforms previous WSSS methods, including several multi-stage approaches. The code and model are publicly available at: https://github.com/DCVL-WSSS/ClusterCAM.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3350176
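
The key idea in the summary, spreading a sparse activation over patch tokens that cluster together, can be illustrated with a toy sketch. This is not the authors' ClusterCAM module (their clustering-based attention is in the linked repository); it is a minimal stand-in that assumes plain k-means over token features and simple per-cluster averaging. The function names `kmeans_labels` and `cluster_guided_activation`, the farthest-point initialisation, and the toy data are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=10):
    """Plain k-means (assumed here for illustration) with a
    deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Pick the point farthest from all centres chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each token joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_guided_activation(tokens, activation, k):
    """Replace each token's activation with the mean activation of its
    cluster, so one highly responsive token lifts its whole cluster."""
    labels = kmeans_labels(tokens, k)
    means = {}
    for c in set(labels):
        vals = [a for a, l in zip(activation, labels) if l == c]
        means[c] = sum(vals) / len(vals)
    return [means[l] for l in labels]


# Toy data: three "object" tokens and three "background" tokens.
tokens = [[1.0, 0.1], [0.9, 0.0], [1.1, 0.2],
          [0.0, 1.0], [0.1, 0.9], [0.2, 1.1]]
# Sparse activation: essentially only one object token responds.
sparse = [0.9, 0.1, 0.0, 0.0, 0.0, 0.0]
dense = cluster_guided_activation(tokens, sparse, k=2)
```

In this toy case the three object tokens end up sharing one activation value (about 0.33) while the background tokens stay at zero, which mirrors, in a very simplified form, the goal of highlighting the whole object region rather than a few tokens.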