
Machine Translation of Closed Captions

Bibliographic Details
Published in: Machine Translation, 2000-01, Vol. 15 (4), p. 311-341
Main Authors: Popowich, Fred; McFetridge, Paul; Turcato, Davide; Toole, Janine
Format: Article
Language: English
Description
Summary: Traditional Machine Translation (MT) systems are designed to translate documents. In this paper we describe an MT system that translates the closed captions that accompany most North American television broadcasts. This domain has two identifying characteristics. First, the captions themselves have properties quite different from the type of textual input that many MT systems have been designed for. This is due to the fact that captions generally represent speech and hence contain many of the phenomena that characterize spoken language. Second, the operational characteristics of the closed-caption domain are also quite distinctive. Unlike most other translation domains, the translated captions are only one of several sources of information that are available to the user. In addition, the user has limited time to comprehend the translation since captions only appear on the screen for a few seconds. In this paper, we look at some of the theoretical and implementational challenges that these characteristics pose for MT. We present a fully automatic large-scale multilingual MT system, ALTo. Our approach is based on Whitelock's Shake and Bake MT paradigm, which relies heavily on lexical resources. The system currently provides wide-coverage translation from English to Spanish. In addition to discussing the design of the system, we also address the evaluation issues that are associated with this domain and report on our current performance.
ISSN: 0922-6567, 1573-0573
DOI: 10.1023/a:1012244918183