Loading…

ASTS: attention based spatio-temporal sequential framework for movie trailer genre classification

Automatic movie trailer genre classification is a challenging task because trailers have more diverse content and high-level sequential semantic concepts within the movie storyline, which can help for multimedia search and personalized movie recommendation. Traditional methods generally extract the...

Full description

Saved in:
Bibliographic Details
Published in:Multimedia tools and applications 2021-03, Vol.80 (7), p.9749-9764
Main Authors: Yu, Yitong, Lu, Ziyu, Li, Yang, Liu, Delong
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Automatic movie trailer genre classification is a challenging task because trailers have more diverse content and high-level sequential semantic concepts within the movie storyline, which can help for multimedia search and personalized movie recommendation. Traditional methods generally extract the low-level features or consider the local sequential dependencies among trailer frames, ignoring the global high-level sequential semantic concepts. In this manuscript, we propose a novel and effective Attention based Spatio-temporal Sequential Framework (ASTS) for movie trailer genre classification. The proposed framework mainly consists of two modules, respectively the spatio-temporal descriptive module and the attention-based sequential module. The spatio-temporal descriptive module adopts some advanced convolution neural networks to extract the spatio-temporal features of key trailer frames, which can capture the local spatio-temporal semantic features. The attention-based sequential module is designed to process the extracted spatio-temporal feature representation sequence for capturing the global high-level sequential semantic concepts within the movie storyline. We crawl 14,415 labeled movie trailers from YouTube and integrate them into the public dataset MovieLens. Experiment results show that our proposed framework is superior to state-of-the-art methods.
ISSN:1380-7501
1573-7721
DOI:10.1007/s11042-020-10125-y