Loading…

SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters

LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent o...

Full description

Saved in:
Bibliographic Details
Main Authors: Liu, Yi, Zhou, Ruilin, Gan, Yuhang, Qian, Chen
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
cited_by
cites
container_end_page 354
container_issue
container_start_page 344
container_title
container_volume
creator Liu, Yi
Zhou, Ruilin
Gan, Yuhang
Qian, Chen
description LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.
doi_str_mv 10.1109/CLOUD62652.2024.00046
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_10643922</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10643922</ieee_id><sourcerecordid>10643922</sourcerecordid><originalsourceid>FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033</originalsourceid><addsrcrecordid>eNotzttKw0AYBOBVECw1b6CwL5D2__eY9a5Eq6GFgj14WTbZTRNtm5BDpW9vQK8GhuFjCHlCmCCCmcbL1fZFMSXZhAETEwAQ6oYERpuIS-AqklzdkhFDaUKFBu5J0LZfwwwhkhL5iHyu66pb7J5pcqqb6lKeD_TDW0c3RVP1h6LuO1rldLFb0_RKk-kqnP3YxtPYZoWn9uzozNm6Ky9D1WffVUXn5bHzTftA7nJ7bH3wn2Oynb9u4vdwuXpL4tkyLBFUN5wSiNxFPEWmuREaItA-RSG4cSljAjODGeNoNDjjMZPIHOZglVY6Bc7H5PHPLb33-7opT7a57gd7ABjjvypNT9U</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><source>IEEE Xplore All Conference Series</source><creator>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</creator><creatorcontrib>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</creatorcontrib><description>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</description><identifier>EISSN: 2159-6190</identifier><identifier>EISBN: 9798350368536</identifier><identifier>DOI: 10.1109/CLOUD62652.2024.00046</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Benchmark testing ; Caching ; Cloud computing ; Costs ; Count-Min sketch ; Cuckoo filter ; Data structures ; Filters ; KV store ; LSM tree ; Throughput</subject><ispartof>2024 IEEE 17th International Conference on Cloud Computing (CLOUD), 2024, p.344-354</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10643922$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,27902,54530,54907</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10643922$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Liu, Yi</creatorcontrib><creatorcontrib>Zhou, Ruilin</creatorcontrib><creatorcontrib>Gan, Yuhang</creatorcontrib><creatorcontrib>Qian, Chen</creatorcontrib><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><title>2024 IEEE 17th International Conference on Cloud Computing (CLOUD)</title><addtitle>CLOUD</addtitle><description>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</description><subject>Benchmark testing</subject><subject>Caching</subject><subject>Cloud computing</subject><subject>Costs</subject><subject>Count-Min sketch</subject><subject>Cuckoo filter</subject><subject>Data structures</subject><subject>Filters</subject><subject>KV store</subject><subject>LSM tree</subject><subject>Throughput</subject><issn>2159-6190</issn><isbn>9798350368536</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotzttKw0AYBOBVECw1b6CwL5D2__eY9a5Eq6GFgj14WTbZTRNtm5BDpW9vQK8GhuFjCHlCmCCCmcbL1fZFMSXZhAETEwAQ6oYERpuIS-AqklzdkhFDaUKFBu5J0LZfwwwhkhL5iHyu66pb7J5pcqqb6lKeD_TDW0c3RVP1h6LuO1rldLFb0_RKk-kqnP3YxtPYZoWn9uzozNm6Ky9D1WffVUXn5bHzTftA7nJ7bH3wn2Oynb9u4vdwuXpL4tkyLBFUN5wSiNxFPEWmuREaItA-RSG4cSljAjODGeNoNDjjMZPIHOZglVY6Bc7H5PHPLb33-7opT7a57gd7ABjjvypNT9U</recordid><startdate>20240707</startdate><enddate>20240707</enddate><creator>Liu, Yi</creator><creator>Zhou, Ruilin</creator><creator>Gan, Yuhang</creator><creator>Qian, Chen</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>20240707</creationdate><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><author>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Benchmark testing</topic><topic>Caching</topic><topic>Cloud computing</topic><topic>Costs</topic><topic>Count-Min sketch</topic><topic>Cuckoo filter</topic><topic>Data structures</topic><topic>Filters</topic><topic>KV store</topic><topic>LSM tree</topic><topic>Throughput</topic><toplevel>online_resources</toplevel><creatorcontrib>Liu, Yi</creatorcontrib><creatorcontrib>Zhou, Ruilin</creatorcontrib><creatorcontrib>Gan, Yuhang</creatorcontrib><creatorcontrib>Qian, Chen</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Yi</au><au>Zhou, Ruilin</au><au>Gan, Yuhang</au><au>Qian, Chen</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</atitle><btitle>2024 IEEE 17th International Conference on Cloud Computing (CLOUD)</btitle><stitle>CLOUD</stitle><date>2024-07-07</date><risdate>2024</risdate><spage>344</spage><epage>354</epage><pages>344-354</pages><eissn>2159-6190</eissn><eisbn>9798350368536</eisbn><coden>IEEPAD</coden><abstract>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</abstract><pub>IEEE</pub><doi>10.1109/CLOUD62652.2024.00046</doi><tpages>11</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 2159-6190
ispartof 2024 IEEE 17th International Conference on Cloud Computing (CLOUD), 2024, p.344-354
issn 2159-6190
language eng
recordid cdi_ieee_primary_10643922
source IEEE Xplore All Conference Series
subjects Benchmark testing
Caching
Cloud computing
Costs
Count-Min sketch
Cuckoo filter
Data structures
Filters
KV store
LSM tree
Throughput
title SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T08%3A00%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=SpotKV:%20Improving%20Read%20Throughput%20of%20KVS%20by%20I/O-Aware%20Cache%20and%20Adaptive%20Cuckoo%20Filters&rft.btitle=2024%20IEEE%2017th%20International%20Conference%20on%20Cloud%20Computing%20(CLOUD)&rft.au=Liu,%20Yi&rft.date=2024-07-07&rft.spage=344&rft.epage=354&rft.pages=344-354&rft.eissn=2159-6190&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CLOUD62652.2024.00046&rft.eisbn=9798350368536&rft_dat=%3Cieee_CHZPO%3E10643922%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10643922&rfr_iscdi=true