Semantic High-Level Features for Automated Cross-Modal Slideshow Generation

Bibliographic Details
Main Authors: Dunker, P., Dittmar, C., Begau, A., Nowak, S., Gruhne, M.
Format: Conference Proceeding
Language: English
Description
Summary: This paper describes a technical solution for automated slideshow generation that extracts a set of high-level features from music, such as beat grid, mood, and genre, and intelligently combines this set with high-level image features, such as mood, daytime, and scene classification. An advantage of this high-level concept is that it enables users to incorporate their preferences regarding the semantic aspects of music and images. For example, a user might request that the system automatically create a slideshow that plays soft music and shows pictures of sunsets from the last 10 years of their own photo collection. The high-level feature extraction on both the audio and the visual information is based on the same underlying machine learning core, which processes different low- and mid-level audio and visual features. This paper describes the technical realization and evaluation of the algorithms with suitable test databases.
ISSN: 1949-3983, 1949-3991
DOI: 10.1109/CBMI.2009.32