Online Attention Accumulation for Weakly Supervised Semantic Segmentation


Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022-10, Vol. 44 (10), p. 7062-7077
Main Authors: Jiang, Peng-Tao; Han, Ling-Hao; Hou, Qibin; Cheng, Ming-Ming; Wei, Yunchao
Format: Article
Language: English
Description
Summary: Object attention maps generated by image classifiers are usually used as priors for weakly supervised semantic segmentation. However, attention maps usually locate only the most discriminative object parts. The lack of integral object localization maps heavily limits the performance of weakly supervised segmentation approaches. This paper investigates a novel way to identify entire object regions in a weakly supervised manner. We observe that image classifiers' attention maps at different training phases may focus on different parts of the target objects. Based on this observation, we propose an online attention accumulation (OAA) strategy that utilizes the attention maps at different training phases to obtain more integral object regions. Specifically, we maintain a cumulative attention map for each target category in each training image and use it to record the object regions discovered at different training phases. Although OAA can effectively mine more object regions for most images, for some training images the range of attention movement is small, which limits the generation of integral object attention regions. To overcome this problem, we propose incorporating an attention drop layer into the online attention accumulation process to explicitly enlarge the range of attention movement during training. Our method (OAA) can be plugged into any classification network and progressively accumulates the discriminative regions into cumulative attention maps as training proceeds. Additionally, we explore using the final cumulative attention maps as pixel-level supervision, which can further assist the network in discovering more integral object regions. When the resulting attention maps are applied to the weakly supervised semantic segmentation task, our approach improves on existing state-of-the-art methods on the PASCAL VOC 2012 segmentation benchmark, achieving an mIoU score of 67.2 percent on the test set.
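The accumulation step described in the summary (one cumulative attention map per target category per training image, updated across training phases) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not state how a new attention map is merged into the cumulative one, so the element-wise maximum used here is an assumption, as are the per-map normalization and the hypothetical names `normalize`, `CumulativeAttention`, and `update`.

```python
import numpy as np


def normalize(att):
    """Clip negative responses and scale one attention map to [0, 1]."""
    att = np.maximum(att, 0.0)
    peak = att.max()
    return att / peak if peak > 0 else att


class CumulativeAttention:
    """Keeps one cumulative map per (image, category) pair.

    Assumption: maps are fused by element-wise maximum, so a region that
    was highlighted in any earlier training phase stays highlighted.
    """

    def __init__(self):
        self.maps = {}  # (image_id, class_id) -> 2D float array

    def update(self, image_id, class_id, attention):
        """Merge the current attention map into the stored cumulative map."""
        att = normalize(np.asarray(attention, dtype=np.float64))
        key = (image_id, class_id)
        prev = self.maps.get(key)
        self.maps[key] = att if prev is None else np.maximum(prev, att)
        return self.maps[key]


# Two training phases attend to different parts of the same object.
acc = CumulativeAttention()
acc.update("img0", 3, [[4.0, 0.0], [0.0, 0.0]])
merged = acc.update("img0", 3, [[0.0, 0.0], [0.0, 2.0]])
print(merged)
```

In this toy case the first phase responds only at the top-left cell and the second only at the bottom-right cell, and the cumulative map retains both. In an actual training loop, `update` would be called with the classifier's attention map for each target category present in the image, each time that image is seen.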
ISSN: 0162-8828, 2160-9292, 1939-3539
DOI: 10.1109/TPAMI.2021.3092573