
Provenance-based data skipping

Database systems use static analysis to determine up front which data is needed for answering a query and use indexes and other physical design techniques to speed up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up front...

Bibliographic Details
Published in: Proceedings of the VLDB Endowment 2021-11, Vol.15 (3), p.451-464
Main Authors: Niu, Xing; Glavic, Boris; Liu, Ziyu; Li, Pengyuan; Gawlick, Dieter; Krishnaswamy, Vasudha; Liu, Zhen Hua; Porobic, Danica
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73
cites cdi_FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73
container_end_page 464
container_issue 3
container_start_page 451
container_title Proceedings of the VLDB Endowment
container_volume 15
creator Niu, Xing
Glavic, Boris
Liu, Ziyu
Li, Pengyuan
Gawlick, Dieter
Krishnaswamy, Vasudha
Liu, Zhen Hua
Porobic, Danica
description Database systems use static analysis to determine up front which data is needed for answering a query and use indexes and other physical design techniques to speed up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up front what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured, it is used to speed up subsequent queries. PBDS can exploit physical design artifacts such as indexes and zone maps.
doi_str_mv 10.14778/3494124.3494130
format article
fullrecord <record><control><sourceid>crossref</sourceid><recordid>TN_cdi_crossref_primary_10_14778_3494124_3494130</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_14778_3494124_3494130</sourcerecordid><originalsourceid>FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73</originalsourceid><addsrcrecordid>eNpNj81KAzEURoNYsLbdu5K-QGpubpKbLKVUKxR00X1IMpky_kyHpAi-vaXOwtX5Fh8HDmN3IFagiOwDKqdAqtWFKK7YVIIW3ApH1__2Dbut9V0IYw3YKbt_K8fv3Ic-ZR5Dzc2yCaewrB_dMHT9Yc4mbfiseTFyxvZPm_16y3evzy_rxx1PUuGJS5JtytA6DaCbRKidiqQFyaBShGgc2mhRGUjaAGl00mLUQA0Zfb7PmPjTpnKsteTWD6X7CuXHg_CXPD_m-TEPfwF-fj-Z</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Provenance-based data skipping</title><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>Niu, Xing ; Glavic, Boris ; Liu, Ziyu ; Li, Pengyuan ; Gawlick, Dieter ; Krishnaswamy, Vasudha ; Liu, Zhen Hua ; Porobic, Danica</creator><creatorcontrib>Niu, Xing ; Glavic, Boris ; Liu, Ziyu ; Li, Pengyuan ; Gawlick, Dieter ; Krishnaswamy, Vasudha ; Liu, Zhen Hua ; Porobic, Danica</creatorcontrib><description>Database systems use static analysis to determine upfront which data is needed for answering a query and use indexes and other physical design techniques to speed-up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up-front what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured it is used to speed up subsequent queries. 
PBDS can exploit physical design artifacts such as indexes and zone maps.</description><identifier>ISSN: 2150-8097</identifier><identifier>EISSN: 2150-8097</identifier><identifier>DOI: 10.14778/3494124.3494130</identifier><language>eng</language><ispartof>Proceedings of the VLDB Endowment, 2021-11, Vol.15 (3), p.451-464</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73</citedby><cites>FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Niu, Xing</creatorcontrib><creatorcontrib>Glavic, Boris</creatorcontrib><creatorcontrib>Liu, Ziyu</creatorcontrib><creatorcontrib>Li, Pengyuan</creatorcontrib><creatorcontrib>Gawlick, Dieter</creatorcontrib><creatorcontrib>Krishnaswamy, Vasudha</creatorcontrib><creatorcontrib>Liu, Zhen Hua</creatorcontrib><creatorcontrib>Porobic, Danica</creatorcontrib><title>Provenance-based data skipping</title><title>Proceedings of the VLDB Endowment</title><description>Database systems use static analysis to determine upfront which data is needed for answering a query and use indexes and other physical design techniques to speed-up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up-front what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured it is used to speed up subsequent queries. 
PBDS can exploit physical design artifacts such as indexes and zone maps.</description><issn>2150-8097</issn><issn>2150-8097</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNpNj81KAzEURoNYsLbdu5K-QGpubpKbLKVUKxR00X1IMpky_kyHpAi-vaXOwtX5Fh8HDmN3IFagiOwDKqdAqtWFKK7YVIIW3ApH1__2Dbut9V0IYw3YKbt_K8fv3Ic-ZR5Dzc2yCaewrB_dMHT9Yc4mbfiseTFyxvZPm_16y3evzy_rxx1PUuGJS5JtytA6DaCbRKidiqQFyaBShGgc2mhRGUjaAGl00mLUQA0Zfb7PmPjTpnKsteTWD6X7CuXHg_CXPD_m-TEPfwF-fj-Z</recordid><startdate>20211101</startdate><enddate>20211101</enddate><creator>Niu, Xing</creator><creator>Glavic, Boris</creator><creator>Liu, Ziyu</creator><creator>Li, Pengyuan</creator><creator>Gawlick, Dieter</creator><creator>Krishnaswamy, Vasudha</creator><creator>Liu, Zhen Hua</creator><creator>Porobic, Danica</creator><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20211101</creationdate><title>Provenance-based data skipping</title><author>Niu, Xing ; Glavic, Boris ; Liu, Ziyu ; Li, Pengyuan ; Gawlick, Dieter ; Krishnaswamy, Vasudha ; Liu, Zhen Hua ; Porobic, Danica</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Niu, Xing</creatorcontrib><creatorcontrib>Glavic, Boris</creatorcontrib><creatorcontrib>Liu, Ziyu</creatorcontrib><creatorcontrib>Li, Pengyuan</creatorcontrib><creatorcontrib>Gawlick, Dieter</creatorcontrib><creatorcontrib>Krishnaswamy, Vasudha</creatorcontrib><creatorcontrib>Liu, Zhen Hua</creatorcontrib><creatorcontrib>Porobic, Danica</creatorcontrib><collection>CrossRef</collection><jtitle>Proceedings of the VLDB Endowment</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Niu, Xing</au><au>Glavic, Boris</au><au>Liu, Ziyu</au><au>Li, Pengyuan</au><au>Gawlick, Dieter</au><au>Krishnaswamy, Vasudha</au><au>Liu, Zhen Hua</au><au>Porobic, Danica</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Provenance-based data skipping</atitle><jtitle>Proceedings of the VLDB Endowment</jtitle><date>2021-11-01</date><risdate>2021</risdate><volume>15</volume><issue>3</issue><spage>451</spage><epage>464</epage><pages>451-464</pages><issn>2150-8097</issn><eissn>2150-8097</eissn><abstract>Database systems use static analysis to determine upfront which data is needed for answering a query and use indexes and other physical design techniques to speed-up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up-front what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured it is used to speed up subsequent queries. PBDS can exploit physical design artifacts such as indexes and zone maps.</abstract><doi>10.14778/3494124.3494130</doi><tpages>14</tpages></addata></record>
fulltext fulltext
identifier ISSN: 2150-8097
ispartof Proceedings of the VLDB Endowment, 2021-11, Vol.15 (3), p.451-464
issn 2150-8097
2150-8097
language eng
recordid cdi_crossref_primary_10_14778_3494124_3494130
source Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
title Provenance-based data skipping
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T16%3A53%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Provenance-based%20data%20skipping&rft.jtitle=Proceedings%20of%20the%20VLDB%20Endowment&rft.au=Niu,%20Xing&rft.date=2021-11-01&rft.volume=15&rft.issue=3&rft.spage=451&rft.epage=464&rft.pages=451-464&rft.issn=2150-8097&rft.eissn=2150-8097&rft_id=info:doi/10.14778/3494124.3494130&rft_dat=%3Ccrossref%3E10_14778_3494124_3494130%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c243t-272fce1f95115dc73594b75072a4cb1b6938b83461c5617539283b517d765c73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
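
The description above says that a provenance sketch concisely encodes which data is relevant for a query and is then reused to speed up later runs. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes, purely for illustration, that the table is range-partitioned on one attribute and that a sketch is the set of fragment numbers holding rows that contributed to the result. The table SALES, the boundaries BOUNDS, and all function names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical toy table of (city, amount) rows.
SALES = [
    ("Austin", 40), ("Boston", 75), ("Boston", 60), ("Chicago", 10),
    ("Denver", 90), ("Denver", 55), ("Miami", 20), ("Seattle", 30),
]

# Assumed range partitioning on city: fragment i holds cities that sort
# at or after BOUNDS[i] and before BOUNDS[i + 1] (the last one is open-ended).
BOUNDS = ["A", "C", "E", "N"]


def fragment_of(city):
    """Return the number of the range fragment that a city falls into."""
    return bisect_right(BOUNDS, city) - 1


def having_query(rows, threshold):
    """SELECT city, SUM(amount) GROUP BY city HAVING SUM(amount) > threshold."""
    totals = defaultdict(int)
    for city, amount in rows:
        totals[city] += amount
    return {city: total for city, total in totals.items() if total > threshold}


def capture_sketch(rows, threshold):
    """Run the query once and record which fragments hold contributing rows."""
    result = having_query(rows, threshold)
    return {fragment_of(city) for city, _ in rows if city in result}


def run_with_sketch(rows, threshold, sketch):
    """Re-run the same query, reading only rows from fragments in the sketch.

    This list comprehension still visits every row; a database could instead
    turn the fragment ranges into a predicate served by an index or zone map.
    """
    relevant = [row for row in rows if fragment_of(row[0]) in sketch]
    return having_query(relevant, threshold)


sketch = capture_sketch(SALES, 100)
print(sketch)
print(run_with_sketch(SALES, 100, sketch))
```

In this toy setup the grouping attribute is also the partitioning attribute, so every group lies entirely inside one fragment and dropping the other fragments cannot change the result. The example reuses the sketch only for the identical query; whether a sketch remains valid for a different query or threshold is a separate question that this example does not address.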