End-to-end semantic-aware object retrieval based on region-wise attention

Bibliographic Details
Published in: Neurocomputing (Amsterdam) 2019-09, Vol.359, p.219-226
Main Authors: Li, Xiu; Jin, Kun; Long, Rujiao
Format: Article
Language: English
Description
Summary: Image representations based on pre-trained Convolutional Neural Networks (CNNs) have achieved state-of-the-art results in computer vision tasks such as object retrieval. Such methods usually encode the activations of convolutional layers, which retain the spatial information of the input image, to produce highly competitive global or local representations. In this work, we propose a region-wise attention mechanism that generates a semantic-aware encoding of convolutional features in two different ways. The first re-weights the convolutional features according to the pixel-wise labels produced by a semantic segmentation CNN; the second is a spatial attention block that adaptively recalibrates region-wise weights by explicitly modelling interdependencies between channels. We further build an end-to-end semantic-aware object retrieval pipeline based on off-the-shelf models and assess our approach on the publicly available datasets Oxford5k and Paris6k, as well as their large-scale counterparts Oxford105k and Paris106k. The proposed approach significantly improves on the prior state of the art.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2019.06.008
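
Illustration: the first mechanism named in the summary, re-weighting convolutional features by per-pixel semantic labels, can be sketched roughly as follows. This is not the authors' implementation. The function name `semantic_weighted_descriptor`, the per-class weight table, and the choice of sum pooling followed by L2 normalization are all assumptions made for the example; the record does not specify how weights are derived or how features are aggregated.

```python
import numpy as np


def semantic_weighted_descriptor(features, labels, class_weights):
    """Pool a (C, H, W) feature map into a C-dim vector.

    Hypothetical sketch: each spatial location is scaled by a weight
    chosen from its semantic class label before sum pooling.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    if features.ndim != 3 or labels.shape != features.shape[1:]:
        raise ValueError("expected features (C, H, W) and labels (H, W)")

    # Look up one weight per spatial location from its class label.
    # class_weights[k] is the (assumed) importance of class k.
    spatial_weights = np.asarray(class_weights, dtype=np.float64)[labels]

    # Apply the same spatial map to every channel, then sum over H and W.
    pooled = (features * spatial_weights[None, :, :]).sum(axis=(1, 2))

    # L2-normalize so descriptors can be compared by dot product.
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


# Toy example: 2 channels on a 2x2 grid; class 1 is kept, class 0 suppressed.
toy_features = np.array([[[1.0, 1.0], [1.0, 1.0]],
                         [[1.0, 2.0], [3.0, 4.0]]])
toy_labels = np.array([[0, 1], [1, 0]])
toy_descriptor = semantic_weighted_descriptor(toy_features, toy_labels, [0.0, 1.0])
```

In the toy example only the two locations labelled as class 1 contribute to the descriptor, so background activations are ignored. In a real pipeline the label map would come from a segmentation network and would need resizing to the feature map's resolution.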