Beyond Class A: A Proposal for Automatic Evaluation of Discourse

The DARPA Spoken Language community has just completed the first trial evaluation of spontaneous query/response pairs in the Air Travel (ATIS) domain. Our goal has been to find a methodology for evaluating correct responses to user queries. To this end, we agreed, for the first trial evaluation, to...

Bibliographic Details
Main Authors: Hirschman, Lynette; Dahl, Deborah A; McKay, Donald P; Norton, Lewis M; Linebarger, Marcia C
Format: Report
Language: English
creator Hirschman, Lynette
Dahl, Deborah A
McKay, Donald P
Norton, Lewis M
Linebarger, Marcia C
description The DARPA Spoken Language community has just completed the first trial evaluation of spontaneous query/response pairs in the Air Travel (ATIS) domain. Our goal has been to find a methodology for evaluating correct responses to user queries. To this end, we agreed, for the first trial evaluation, to constrain the problem in several ways:
Database Application: Constrain the application to a database query application, to ease the burden of a) constructing the back-end, and b) determining correct responses;
Canonical Answer: Constrain answer comparison to a minimal canonical answer that imposes the fewest constraints on the form of system response displayed to a user at each site;
Typed Input: Constrain the evaluation to typed input only;
Class A: Constrain the test set to single unambiguous intelligible utterances taken without context that have well-defined database answers (class A sentences).
These were reasonable constraints to impose on the first trial evaluation. However, it is clear that we need to loosen these constraints to obtain a more realistic evaluation of spoken language systems. The purpose of this paper is to suggest how we can move beyond evaluation of class A sentences to an evaluation of connected dialogue, including out-of-domain queries. Sponsored in part by DARPA.
format report
creatorcontrib UNISYS DEFENSE SYSTEMS PAOLI PA
date 1990-01
rights Approved for public release; distribution is unlimited.
sourcetype Open Access Repository
linktorsrc https://apps.dtic.mil/sti/citations/ADA458704
fulltext fulltext_linktorsrc
language eng
recordid cdi_dtic_stinet_ADA458704
source DTIC Technical Reports
subjects AUTOMATION
COMMUNITIES
DATA BASES
INPUT
INTERROGATION
LANGUAGE
RESPONSE
SPEECH
TEST SETS
USER NEEDS
Voice Communications
WORDS(LANGUAGE)
title Beyond Class A: A Proposal for Automatic Evaluation of Discourse
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T17%3A28%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-dtic_1RU&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.btitle=Beyond%20Class%20A:%20A%20Proposal%20for%20Automatic%20Evaluation%20of%20Discourse&rft.au=Hirschman,%20Lynette&rft.aucorp=UNISYS%20DEFENSE%20SYSTEMS%20PAOLI%20PA&rft.date=1990-01&rft_id=info:doi/&rft_dat=%3Cdtic_1RU%3EADA458704%3C/dtic_1RU%3E%3Cgrp_id%3Ecdi_FETCH-dtic_stinet_ADA4587043%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true