Cardinality estimation via learned dynamic sample selection

Bibliographic Details
Published in: Information Systems (Oxford), 2023-07, Vol. 117, Article 102252
Main Authors: Wang, Run-An; Zou, Zhaonian; Jing, Ziqi
Format: Article
Language: English
Description
Summary: Sampling is an effective approach to cardinality estimation, which in turn is key to query optimization in a DBMS. Although many recent studies apply machine learning to cardinality estimation, enhancing sampling-based cardinality estimation with machine learning has long been overlooked. In this paper, we propose a new sampling-based cardinality estimation method called LDSS, built on a learning-based dynamic sample selection method. Unlike existing sampling-based methods, which perform online sampling for every query, our method selects for each query the most suitable sample from a set of samples of various sizes that were materialized during preprocessing. The cardinality of the query is then estimated from the selected sample. Since our method is based on sampling, it can handle both single-table queries and join queries. Because samples are selected dynamically, costly online sampling is avoided entirely. By learning the complex relationships between samples and queries, our learned sample selector can recommend small yet good samples for input queries. An extensive evaluation on benchmarks indicates that LDSS can be trained much faster and can achieve higher accuracy than the state-of-the-art query-dependent methods, and accuracy comparable to the current data-driven methods.
• We propose the first cardinality estimation framework using dynamic sample selection.
• We propose LDSS, a learning-based approach to dynamic sample selection.
• We develop a method to incrementally update the learned sample selection model.
• We experimentally evaluated the performance of LDSS using a set of benchmarks.
• LDSS can achieve better accuracy than the state-of-the-art query-dependent methods.
ISSN: 0306-4379; 1873-6076
DOI: 10.1016/j.is.2023.102252
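
Illustrative sketch (not from the article): the general idea in the summary, materializing samples of several sizes up front and then choosing one per query before scaling up the match count, might look roughly like the Python below. This is not the LDSS implementation. All function names are hypothetical, the table is a single integer column, queries are closed ranges, and `select_sample` is a simple hand-written rule that assumes uniformly distributed values; it stands in for the learned selector, whose model the record does not describe.

```python
import random


def materialize_samples(values, fractions, seed=0):
    """Preprocessing: draw one uniform sample per fraction, smallest first."""
    rng = random.Random(seed)
    return [
        rng.sample(values, max(1, round(len(values) * f)))
        for f in sorted(fractions)
    ]


def select_sample(samples, lo, hi, domain_size, target_matches=50):
    """Hypothetical stand-in for a learned selector (assumption, not LDSS).

    Picks the smallest sample expected to contain about `target_matches`
    matching rows if values were uniform over `domain_size` distinct values.
    Falls back to the largest sample when none is big enough.
    """
    width = max(1, hi - lo + 1)
    needed = target_matches * domain_size / width
    for sample in samples:  # samples are ordered by increasing size
        if len(sample) >= needed:
            return sample
    return samples[-1]


def estimate_cardinality(sample, table_size, lo, hi):
    """Scale the number of matching sample rows up to the full table."""
    matches = sum(1 for v in sample if lo <= v <= hi)
    return matches * table_size / len(sample)


if __name__ == "__main__":
    rng = random.Random(42)
    table = [rng.randrange(1000) for _ in range(100_000)]
    samples = materialize_samples(table, [0.001, 0.01, 0.1])

    for lo, hi in [(0, 499), (100, 109)]:
        chosen = select_sample(samples, lo, hi, domain_size=1000)
        estimate = estimate_cardinality(chosen, len(table), lo, hi)
        truth = sum(1 for v in table if lo <= v <= hi)
        print(f"range [{lo}, {hi}]: sample size {len(chosen)}, "
              f"estimate {estimate:.0f}, true {truth}")
```

The point of the sketch is the trade-off the summary describes: a wide range can be answered from a small sample, while a narrow range needs a larger one, and no sampling happens at query time.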