
A Novel Attribute Selection Mechanism for Video Captioning

Bibliographic Details
Main Authors: Xiao, Huanhou; Shi, Jinglun
Format: Conference Proceeding
Language: English
Description
Summary: Attributes are increasingly popular for enhancing the performance of video captioning, which requires semantic understanding of videos and the ability to generate natural language descriptions. However, existing methods for detecting visual attributes are flawed, so the captioning model may be misled by erroneous attributes. Moreover, each semantic attribute plays a different role in next-word generation, and how to use attributes effectively in the video captioning task remains a challenge. To tackle these problems, we propose a novel framework that applies an attention mechanism, guided by visual attention, to the detected video attributes in order to make a soft selection over them. At the same time, a reinforcement learning algorithm is employed to better select the useful attributes. Experimental results on benchmark datasets demonstrate that the proposed attribute selection mechanism can focus on appropriate attributes and boost the performance of captioning models.
ISSN: 2381-8549
DOI: 10.1109/ICIP.2019.8803785