Provenance-based data skipping

Bibliographic Details
Published in: Proceedings of the VLDB Endowment 2021-11, Vol.15 (3), p.451-464
Main Authors: Niu, Xing; Glavic, Boris; Liu, Ziyu; Li, Pengyuan; Gawlick, Dieter; Krishnaswamy, Vasudha; Liu, Zhen Hua; Porobic, Danica
Format: Article
Language: English
Description
Summary: Database systems use static analysis to determine up front which data is needed for answering a query and use indexes and other physical design techniques to speed up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up front what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured, it is used to speed up subsequent queries. PBDS can exploit physical design artifacts such as indexes and zone maps.
ISSN: 2150-8097
DOI: 10.14778/3494124.3494130