To be an Artist: Automatic Generation on Food Image Aesthetic Captioning

Bibliographic Details
Main Authors: Zou, Xiaohan, Lin, Cheng, Zhang, Yinjia, Zhao, Qinpei
Format: Conference Proceeding
Language: English
Description
Summary: Image aesthetic captioning is a multi-modal task that generates aesthetic critiques for images. In contrast to common image captioning tasks, where different captions aimed at providing factual descriptions of the same image are usually similar, captions addressing different aesthetic attributes of the same image can be entirely different in an aesthetic captioning task. Such inter-aspect differences are often overlooked, which leads to a lack of diversity and coherence in the captions generated by most existing image aesthetic captioning systems. In this paper, we propose a novel model to generate aesthetic captions for food images. Our model redefines food image aesthetic captioning as a compositional task consisting of two separate modules: single-aspect captioning and unsupervised text compression. The first module generates captions and learns feature representations for each aesthetic attribute. The second module then studies the associations among all feature representations and automatically aggregates the captions of all aesthetic attributes into a final sentence. We also collect a dataset containing pair-wise image-comment data related to six aesthetic attributes. Two new evaluation criteria are introduced to comprehensively assess the quality of the generated captions. Experiments on the dataset demonstrate the effectiveness of the proposed model.
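The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration only: the function names, the six-aspect list, and the toy stand-in modules are assumptions for demonstration, not the paper's actual models.

```python
# Hypothetical sketch of the compositional two-stage design:
# stage 1 produces one caption per aesthetic attribute,
# stage 2 aggregates them into a single final critique.

def caption_each_aspect(image_features, aspects, captioner):
    # Stage 1: single-aspect captioning, one caption per attribute.
    return {a: captioner(image_features, a) for a in aspects}

def aggregate_captions(aspect_captions, compressor):
    # Stage 2: fuse the per-aspect captions into one coherent sentence
    # (stands in for the unsupervised text compression module).
    return compressor(list(aspect_captions.values()))

# Toy stand-ins for the learned modules (assumed, not from the paper).
aspects = ["color", "composition", "lighting", "focus", "plating", "appeal"]
toy_captioner = lambda feats, aspect: f"The {aspect} is pleasing."
toy_compressor = lambda captions: " ".join(captions)

per_aspect = caption_each_aspect({}, aspects, toy_captioner)
final_caption = aggregate_captions(per_aspect, toy_compressor)
```

In the real system each stand-in would be a learned neural module; the point here is only the control flow, i.e., that per-attribute generation and cross-attribute aggregation are decoupled.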
ISSN:2375-0197
DOI:10.1109/ICTAI50040.2020.00124