Loading…
Weakly Supervised Object Detection via Object-Specific Pixel Gradient
Most existing object detection algorithms are trained based upon a set of fully annotated object regions or bounding boxes, which are typically labor-intensive. On the contrary, nowadays there is a significant amount of image-level annotations cheaply available on the Internet. It is hence a natural...
Saved in:
Published in: | IEEE transaction on neural networks and learning systems 2018-12, Vol.29 (12), p.5960-5970 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Most existing object detection algorithms are trained based upon a set of fully annotated object regions or bounding boxes, which are typically labor-intensive. On the contrary, nowadays there is a significant amount of image-level annotations cheaply available on the Internet. It is hence a natural thought to explore such "weak" supervision to benefit the training of object detectors. In this paper, we propose a novel scheme to perform weakly supervised object localization, termed object-specific pixel gradient (OPG). The OPG is trained by using image-level annotations alone, which performs in an iterative manner to localize potential objects in a given image robustly and efficiently. In particular, we first extract an OPG map to reveal the contributions of individual pixels to a given object category, upon which an iterative mining scheme is further introduced to extract instances or components of this object. Moreover, a novel average and max pooling layer is introduced to improve the localization accuracy. In the task of weakly supervised object localization, the OPG achieves a state-of-the-art 44.5% top-5 error on ILSVRC 2013, which outperforms competing methods, including Oquab et al. and region-based convolutional neural networks on the Pascal VOC 2012, with gains of 2.6% and 2.3%, respectively. In the task of object detection, OPG achieves a comparable performance of 27.0% mean average precision on Pascal VOC 2007. In all experiments, the OPG only adopts the off-the-shelf pretrained CNN model, without using any object proposals. Therefore, it also significantly improves the detection speed, i.e., achieving three times faster compared with the state-of-the-art method. |
---|---|
ISSN: | 2162-237X 2162-2388 |
DOI: | 10.1109/TNNLS.2018.2816021 |