
Limousine: Blending Learned and Classical Indexes to Self-Design Larger-than-Memory Cloud Storage Engines

Bibliographic Details
Published in: Proceedings of the ACM on Management of Data, 2024-03, Vol. 2 (1), p. 1-28, Article 47
Main Authors: Chatterjee, Subarna, Pekala, Mark F., Kruglyak, Lev, Idreos, Stratos
Format: Article
Language: English
Description
Summary: We present Limousine, a self-designing key-value storage engine that can automatically morph into the near-optimal storage engine architecture given a workload, a cloud budget, and a target performance. At its core, Limousine identifies the fundamental design principles of storage engines as combinations of learned and classical data structures that collaborate through algorithms for data storage and access. By unifying these principles over diverse hardware and three major cloud providers (AWS, GCP, and Azure), Limousine creates a massive design space of a quindecillion (10^48) storage engine designs, the vast majority of which do not exist in literature or industry. Limousine contains a distribution-aware IO model to accurately evaluate any candidate design. Using these models, Limousine searches within the exhaustive design space to construct a navigable continuum of designs connected along a Pareto frontier of cloud cost and performance. For storage engines that contain learned components, Limousine also introduces efficient lazy write algorithms to optimize the holistic read-write performance. Once the near-optimal design is decided for the given context, Limousine automatically materializes the corresponding design in Rust code. Using the YCSB benchmark, we demonstrate that storage engines automatically designed and generated by Limousine scale better by up to 3 orders of magnitude when compared with state-of-the-art, industry-leading engines such as RocksDB, WiredTiger, FASTER, and Cosine, over diverse workloads, data sets, and cloud budgets.
ISSN: 2836-6573
DOI: 10.1145/3639302