Images as sets of locally weighted features

Bibliographic Details
Published in:	Computer vision and image understanding 2012, Vol.116 (1), p.68-85
Main Authors:	de Campos, Teófilo; Csurka, Gabriela; Perronnin, Florent
Format:	Article
Language:	English
Subjects:	Applied sciences; Artificial intelligence; Bag-of-visual-words; Computer science; control theory; systems; Exact sciences and technology; Feedback; Houses; Image categorisation; Pattern recognition. Digital image processing. Computational geometry; Performance enhancement; Representations; Saliency estimation; Sampling; Visual

Description
Summary:	► We propose to represent images as bags of locally weighted features. ► A weight factor, obtained from a saliency map, is associated with each local feature. ► We evaluated several saliency map methods with our scheme for image categorisation. ► Maps obtained from eye gaze data lead to a significant performance improvement. ► Simple maps, such as the central bias, also improved the categorisation performance. This paper presents a generic framework in which images are modelled as order-less sets of weighted visual features. Each visual feature is associated with a weight factor that may indicate its relevance. This framework can be applied to various bag-of-features approaches such as the bag-of-visual-words or the Fisher kernel representations. We suggest that if dense sampling is used, different schemes to weight local features can be evaluated, leading to results that are often better than the combination of multiple sampling schemes, at a much lower computational cost, because the features are extracted only once. This allows our framework to be a test-bed for saliency estimation methods in image categorisation tasks. We explored two main possibilities for the estimation of local feature relevance. The first one is based on the use of saliency maps obtained from human feedback, either by gaze tracking or by mouse clicks. The method is able to profit from such maps, leading to a significant improvement in categorisation performance. The second possibility is based on automatic saliency estimation methods, including Itti & Koch’s method and SIFT’s DoG. We evaluated the proposed framework and saliency estimation methods using an in-house dataset and the PASCAL VOC 2008/2007 dataset, showing that some of the saliency estimation methods lead to a significant performance improvement in comparison to the standard unweighted representation.
ISSN:	1077-3142 1090-235X
DOI:	10.1016/j.cviu.2011.07.011
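
The summary describes associating a saliency-derived weight with each local feature before pooling into a bag-of-visual-words representation. The following is a minimal sketch of that idea, not the authors' implementation. It assumes hard assignment to the nearest codeword, L1 normalisation of the histogram, and a Gaussian central-bias map. The function names and the `sigma_frac` parameter are hypothetical and do not come from the record.

```python
import math


def central_bias_weight(x, y, width, height, sigma_frac=0.25):
    """Gaussian weight that peaks at the image centre.

    sigma_frac is a hypothetical parameter: the standard deviation
    expressed as a fraction of the image width and height.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    sx, sy = sigma_frac * width, sigma_frac * height
    return math.exp(-0.5 * (((x - cx) / sx) ** 2 + ((y - cy) / sy) ** 2))


def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for k, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = k, dist
    return best_index


def weighted_bov_histogram(descriptors, weights, codebook):
    """Bag-of-visual-words histogram where each feature votes with its weight.

    Assumption: hard assignment and L1 normalisation. With all weights
    equal, this reduces to the standard unweighted histogram.
    """
    hist = [0.0] * len(codebook)
    for descriptor, weight in zip(descriptors, weights):
        hist[nearest_codeword(descriptor, codebook)] += weight
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist
```

Because the weights only enter at the pooling step, the descriptors from a single dense sampling pass can be reused with any saliency map, for example by computing `central_bias_weight` at each feature location and passing the resulting list as `weights`.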