Provenance-based data skipping
Database systems use static analysis to determine upfront which data is needed for answering a query and use indexes and other physical design techniques to speed up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine upfront...
Published in: | Proceedings of the VLDB Endowment 2021-11, Vol.15 (3), p.451-464 |
---|---|
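Main Authors: | Niu, Xing; Glavic, Boris; Liu, Ziyu; Li, Pengyuan; Gawlick, Dieter; Krishnaswamy, Vasudha; Liu, Zhen Hua; Porobic, Danica |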
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73 |
---|---|
cites | cdi_FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73 |
container_end_page | 464 |
container_issue | 3 |
container_start_page | 451 |
container_title | Proceedings of the VLDB Endowment |
container_volume | 15 |
creator | Niu, Xing Glavic, Boris Liu, Ziyu Li, Pengyuan Gawlick, Dieter Krishnaswamy, Vasudha Liu, Zhen Hua Porobic, Danica |
description | Database systems use static analysis to determine upfront which data is needed for answering a query and use indexes and other physical design techniques to speed up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine upfront what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured, it is used to speed up subsequent queries. PBDS can exploit physical design artifacts such as indexes and zone maps. |
doi_str_mv | 10.14778/3494124.3494130 |
format | article |
fulltext | fulltext |
identifier | ISSN: 2150-8097 |
ispartof | Proceedings of the VLDB Endowment, 2021-11, Vol.15 (3), p.451-464 |
issn | 2150-8097 |
language | eng |
recordid | cdi_crossref_primary_10_14778_3494124_3494130 |
source | Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list) |
title | Provenance-based data skipping |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T16%3A53%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Provenance-based%20data%20skipping&rft.jtitle=Proceedings%20of%20the%20VLDB%20Endowment&rft.au=Niu,%20Xing&rft.date=2021-11-01&rft.volume=15&rft.issue=3&rft.spage=451&rft.epage=464&rft.pages=451-464&rft.issn=2150-8097&rft.eissn=2150-8097&rft_id=info:doi/10.14778/3494124.3494130&rft_dat=%3Ccrossref%3E10_14778_3494124_3494130%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |