Learning Consistency from High-confidence Pseudo-labels for Weakly Supervised Object Localization

Bibliographic Details
Published in: IEEE Access, 2023-01, Vol. 11, p. 1-1
Main Authors: Sun, Kangbo; Zhu, Jie
Format: Article
Language: English
Description
Summary: Weakly supervised object localization (WSOL) aims to classify and locate a single object under the supervision of image-level labels only. Pseudo-supervised learning methods, which divide WSOL into two decoupled subtasks (classification and localization), have been shown to be effective for WSOL. The decoupled framework improves the performance of the localization subtask, but the predicted localizations are not robust enough because the pseudo-labels are noisy. Based on the assumption that a localization model should make similar predictions on different versions of the same image, we propose an additional refinement stage to learn more consistent localization. Specifically, in the refinement stage, we propose a simple and effective method for evaluating the confidence of pseudo-labels based on classification discrimination; by learning consistency from high-confidence pseudo-labels, we further refine the localization model to obtain better localization performance. In addition, in the initialization stage, we propose a mask-based pseudo-label generator to initialize the localization model. We conduct experiments on two benchmark datasets: CUB-200-2011 and ImageNet-1k. Experimental results show that our two-stage approach achieves 94.01% GT-Known localization accuracy on the CUB-200-2011 test set and 65.23% GT-Known localization accuracy on the ImageNet-1k validation set. Moreover, when applied directly to the pseudo-supervised localization model, our refinement stage achieves 94.05% and 67.13% GT-Known localization accuracy on CUB-200-2011 and ImageNet-1k, respectively, outperforming the corresponding pseudo-supervised localization model by 3.34% and 2.34%.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3246259
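
Note on the metric: the summary reports GT-Known localization accuracy. As commonly defined in the WSOL literature, a prediction counts as correct when the intersection over union (IoU) between the predicted box and the ground-truth box is at least 0.5, regardless of the predicted class. The following is a minimal sketch of that metric, not the authors' evaluation code. The function names, the (x_min, y_min, x_max, y_max) box format, and the one-box-per-image setup are assumptions made for illustration; benchmarks with several ground-truth boxes per image usually score against the best-matching one.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in continuous coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def gt_known_loc_accuracy(
    preds: Sequence[Box], gts: Sequence[Box], threshold: float = 0.5
) -> float:
    """Fraction of images whose predicted box overlaps the ground-truth
    box with IoU >= threshold. Class predictions are ignored."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    hits = sum(iou(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(preds)


if __name__ == "__main__":
    # Hypothetical boxes: one exact match, one complete miss.
    predicted = [(0, 0, 10, 10), (0, 0, 10, 10)]
    ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    print(gt_known_loc_accuracy(predicted, ground_truth))
```

Because the class label is ignored, this metric isolates localization quality from classification quality, which matches the decoupled framing described in the summary.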