You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL
While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessarily high inference cost and often overlooking crucial database knowledge. To address these issues, we propose You Only Read Once (YOR...
Published in: | arXiv.org 2024-09 |
---|---|
Main Authors: | Kobayashi, Hideo; Lan, Wuwei; Shi, Peng; Chang, Shuaichen; Guo, Jiang; Zhu, Henghui; Wang, Zhiguo; Ng, Patrick |
Format: | Article |
Language: | English |
Subjects: | Inference; Query languages; Questions |
Online Access: | Get full text |
cited_by | |
---|---|
container_title | arXiv.org |
creator | Kobayashi, Hideo; Lan, Wuwei; Shi, Peng; Chang, Shuaichen; Guo, Jiang; Zhu, Henghui; Wang, Zhiguo; Ng, Patrick |
description | While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessarily high inference cost and often overlooking crucial database knowledge. To address these issues, we propose You Only Read Once (YORO), a novel paradigm that directly internalizes database knowledge into the parametric knowledge of a text-to-SQL model during training and eliminates the need for schema encoding during inference. YORO significantly reduces the input token length by 66%-98%. Despite its shorter inputs, our empirical results demonstrate that YORO performs competitively with traditional systems on three benchmarks and significantly outperforms them on large databases. Furthermore, YORO excels at handling questions with challenging value retrievals, such as abbreviations. |
format | article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3106850019</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3106850019</sourcerecordid><originalsourceid>FETCH-proquest_journals_31068500193</originalsourceid><addsrcrecordid>eNqNjMsKgkAUQIcgSMp_uNCmFsI4k2Zte1AUSGYLVzLlVRSZKWekx9dX0Ae0OmdxOB1iMc5dJ5gw1iO21hWllPlT5nncIqdEtRDK-gkRiuxjF4RREkbheA57FI0sZQFGwVYabKSoyxfCUhhxFhphJ9W9xqxAyFUDMT6MY5RzPOwHpJuLWqP9Y58M16t4sXGujbq1qE1aqfa70yl3qR94lLoz_l_1BokuPpY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3106850019</pqid></control><display><type>article</type><title>You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL</title><source>Publicly Available Content Database</source><creator>Kobayashi, Hideo ; Lan, Wuwei ; Shi, Peng ; Chang, Shuaichen ; Guo, Jiang ; Zhu, Henghui ; Wang, Zhiguo ; Ng, Patrick</creator><creatorcontrib>Kobayashi, Hideo ; Lan, Wuwei ; Shi, Peng ; Chang, Shuaichen ; Guo, Jiang ; Zhu, Henghui ; Wang, Zhiguo ; Ng, Patrick</creatorcontrib><description>While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessary high inference cost and often overlooking crucial database knowledge. To address these issues, we propose You Only Read Once (YORO), a novel paradigm that directly internalizes database knowledge into the parametric knowledge of a text-to-SQL model during training and eliminates the need for schema encoding during inference. YORO significantly reduces the input token length by 66%-98%. Despite its shorter inputs, our empirical results demonstrate YORO's competitive performances with traditional systems on three benchmarks as well as its significant outperformance on large databases. 
Furthermore, YORO excels in handling questions with challenging value retrievals such as abbreviation.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Inference ; Query languages ; Questions</subject><ispartof>arXiv.org, 2024-09</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/3106850019?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,37012,44590</link.rule.ids></links><search><creatorcontrib>Kobayashi, Hideo</creatorcontrib><creatorcontrib>Lan, Wuwei</creatorcontrib><creatorcontrib>Shi, Peng</creatorcontrib><creatorcontrib>Chang, Shuaichen</creatorcontrib><creatorcontrib>Guo, Jiang</creatorcontrib><creatorcontrib>Zhu, Henghui</creatorcontrib><creatorcontrib>Wang, Zhiguo</creatorcontrib><creatorcontrib>Ng, Patrick</creatorcontrib><title>You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL</title><title>arXiv.org</title><description>While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessary high inference cost and often overlooking crucial database knowledge. 
To address these issues, we propose You Only Read Once (YORO), a novel paradigm that directly internalizes database knowledge into the parametric knowledge of a text-to-SQL model during training and eliminates the need for schema encoding during inference. YORO significantly reduces the input token length by 66%-98%. Despite its shorter inputs, our empirical results demonstrate YORO's competitive performances with traditional systems on three benchmarks as well as its significant outperformance on large databases. Furthermore, YORO excels in handling questions with challenging value retrievals such as abbreviation.</description><subject>Inference</subject><subject>Query languages</subject><subject>Questions</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNjMsKgkAUQIcgSMp_uNCmFsI4k2Zte1AUSGYLVzLlVRSZKWekx9dX0Ae0OmdxOB1iMc5dJ5gw1iO21hWllPlT5nncIqdEtRDK-gkRiuxjF4RREkbheA57FI0sZQFGwVYabKSoyxfCUhhxFhphJ9W9xqxAyFUDMT6MY5RzPOwHpJuLWqP9Y58M16t4sXGujbq1qE1aqfa70yl3qR94lLoz_l_1BokuPpY</recordid><startdate>20240918</startdate><enddate>20240918</enddate><creator>Kobayashi, Hideo</creator><creator>Lan, Wuwei</creator><creator>Shi, Peng</creator><creator>Chang, Shuaichen</creator><creator>Guo, Jiang</creator><creator>Zhu, Henghui</creator><creator>Wang, Zhiguo</creator><creator>Ng, Patrick</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240918</creationdate><title>You Only Read Once (YORO): Learning to Internalize Database 
Knowledge for Text-to-SQL</title><author>Kobayashi, Hideo ; Lan, Wuwei ; Shi, Peng ; Chang, Shuaichen ; Guo, Jiang ; Zhu, Henghui ; Wang, Zhiguo ; Ng, Patrick</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_31068500193</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Inference</topic><topic>Query languages</topic><topic>Questions</topic><toplevel>online_resources</toplevel><creatorcontrib>Kobayashi, Hideo</creatorcontrib><creatorcontrib>Lan, Wuwei</creatorcontrib><creatorcontrib>Shi, Peng</creatorcontrib><creatorcontrib>Chang, Shuaichen</creatorcontrib><creatorcontrib>Guo, Jiang</creatorcontrib><creatorcontrib>Zhu, Henghui</creatorcontrib><creatorcontrib>Wang, Zhiguo</creatorcontrib><creatorcontrib>Ng, Patrick</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kobayashi, Hideo</au><au>Lan, Wuwei</au><au>Shi, Peng</au><au>Chang, Shuaichen</au><au>Guo, Jiang</au><au>Zhu, Henghui</au><au>Wang, Zhiguo</au><au>Ng, Patrick</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL</atitle><jtitle>arXiv.org</jtitle><date>2024-09-18</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessary high inference cost and often overlooking crucial database knowledge. To address these issues, we propose You Only Read Once (YORO), a novel paradigm that directly internalizes database knowledge into the parametric knowledge of a text-to-SQL model during training and eliminates the need for schema encoding during inference. YORO significantly reduces the input token length by 66%-98%. Despite its shorter inputs, our empirical results demonstrate YORO's competitive performances with traditional systems on three benchmarks as well as its significant outperformance on large databases. Furthermore, YORO excels in handling questions with challenging value retrievals such as abbreviation.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-09 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3106850019 |
source | Publicly Available Content Database |
subjects | Inference; Query languages; Questions |
title | You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T17%3A50%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=You%20Only%20Read%20Once%20(YORO):%20Learning%20to%20Internalize%20Database%20Knowledge%20for%20Text-to-SQL&rft.jtitle=arXiv.org&rft.au=Kobayashi,%20Hideo&rft.date=2024-09-18&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3106850019%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_31068500193%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3106850019&rft_id=info:pmid/&rfr_iscdi=true |