Loading…

Personalized video summary using visual semantic annotations and automatic speech transcriptions

A personalized video summary is dynamically generated in our video personalization and summary system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order maintain, select, adapt, and deliver rich media conten...

Full description

Saved in:
Bibliographic Details
Main Authors: Tseng, B.L., Ching-Yung Lin
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:A personalized video summary is dynamically generated in our video personalization and summary system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. The process includes the shot-to-sentence alignment, summary segment selection, and user preference matching and propagation. As a result, the relevant visual shot and audio sentence segments are aggregated and composed into a personalized video summary.
DOI:10.1109/MMSP.2002.1203234