Generating Image Descriptions Using Semantic Similarities in the Output Space

Bibliographic Details
Main Authors: Verma, Yashaswi, Gupta, Ankush, Mannem, Prashanth, Jawahar, C. V.
Format: Conference Proceeding
Language: English
Description
Summary: Automatically generating meaningful descriptions for images has recently emerged as an important area of research. In this direction, the nearest-neighbour based generative phrase prediction model (PPM) proposed by Gupta et al. (2012) was shown to achieve state-of-the-art results on the PASCAL sentence dataset, thanks to its simultaneous use of three different sources of information (visual cues, corpus statistics and available descriptions). However, PPM does not utilize semantic similarities among phrases, which could help relate semantically similar phrases during phrase relevance prediction. In this paper, we extend that model by incorporating inter-phrase semantic similarities. To compute the similarity between two phrases, we consider similarities among their constituent words, determined using WordNet. We also re-formulate the objective function for parameter learning so that each pair of phrases is penalized unevenly, in a manner similar to structured prediction. Various automatic and human evaluations are performed to demonstrate the advantage of our "semantic phrase prediction model" (SPPM) over PPM.
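The abstract describes lifting word-level similarity (from WordNet) to phrase-level similarity over constituent words. As a minimal illustrative sketch only (not the authors' actual formulation), the snippet below uses a tiny hand-coded word-similarity table in place of a real WordNet measure; the table entries, function names, and the symmetric best-match averaging scheme are all assumptions for illustration.

```python
# Hypothetical sketch: phrase similarity from constituent-word similarities.
# A hand-coded table stands in for a WordNet-based word similarity measure.

WORD_SIM = {  # toy stand-in for WordNet similarity scores
    ("dog", "puppy"): 0.9,
    ("dog", "cat"): 0.6,
    ("run", "sprint"): 0.8,
}

def word_sim(a, b):
    """Symmetric word similarity; identical words score 1.0, unknown pairs 0.0."""
    if a == b:
        return 1.0
    return WORD_SIM.get((a, b), WORD_SIM.get((b, a), 0.0))

def phrase_sim(p1, p2):
    """Lift word similarity to phrases: for each word in one phrase take its
    best match in the other, average, and symmetrize over both directions."""
    def one_way(src, dst):
        return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)
    return 0.5 * (one_way(p1, p2) + one_way(p2, p1))

print(phrase_sim(["dog", "run"], ["puppy", "sprint"]))  # close to 0.85
```

In practice a measure such as path- or Wu-Palmer similarity over WordNet synsets would replace the toy table; the best-match-then-average scheme is one common way to aggregate word similarities into a phrase score.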
ISSN: 2160-7508, 2160-7516
DOI: 10.1109/CVPRW.2013.50