A Novel Attribute Selection Mechanism for Video Captioning
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Attributes are increasingly used to improve video captioning, which requires both semantic understanding of videos and the ability to generate natural-language descriptions. However, existing methods for detecting visual attributes are error-prone, so the captioning model may be misled by erroneous attributes. In addition, each semantic attribute plays a different role in next-word generation, and how to use attributes effectively in the video captioning task remains a challenge. To tackle these problems, we propose a novel framework that applies an attention mechanism, guided by the visual attention, to the detected video attributes in order to make a soft selection over them. At the same time, a reinforcement learning algorithm is employed to better select the useful attributes. Experimental results on benchmark datasets demonstrate that the proposed attribute selection mechanism focuses on appropriate attributes and improves the captioning models. |
---|---|
ISSN: | 2381-8549 |
DOI: | 10.1109/ICIP.2019.8803785 |
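
The soft selection over detected attributes described in the summary can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the scoring function, so dot-product scoring is assumed here, and the names `soft_select_attributes`, `attr_embeddings`, and `visual_context` are hypothetical. The reinforcement learning component is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_select_attributes(attr_embeddings, visual_context):
    """Softly select attributes using a visual context vector as the guide.

    attr_embeddings: list of K attribute vectors (hypothetical embeddings
        of the detected attributes), each of dimension d.
    visual_context: a d-dimensional vector standing in for the output of
        the visual attention at the current decoding step.

    Assumption: relevance is scored with a plain dot product; the scoring
    function used in the paper is not stated in the abstract.
    """
    scores = [dot(a, visual_context) for a in attr_embeddings]
    weights = softmax(scores)  # soft selection: weights sum to 1
    dim = len(visual_context)
    # Weighted sum of attribute vectors, to be fed to the caption decoder.
    selected = [
        sum(w * a[i] for w, a in zip(weights, attr_embeddings))
        for i in range(dim)
    ]
    return weights, selected


# Toy example: three detected attributes in a 2-d embedding space.
attrs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
context = [2.0, 0.0]
weights, selected = soft_select_attributes(attrs, context)
```

In this toy example the first attribute aligns best with the visual context and receives the largest weight, while a poorly aligned (possibly erroneous) attribute is down-weighted rather than discarded outright.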