Loading…

SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters

LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent o...

Full description

Saved in:

Bibliographic Details
Main Authors:	Liu, Yi, Zhou, Ruilin, Gan, Yuhang, Qian, Chen
Format:	Conference Proceeding
Language:	English
Subjects:	Benchmark testing Caching Cloud computing Costs Count-Min sketch Cuckoo filter Data structures Filters KV store LSM tree Throughput
Online Access:	Request full text
Tags:	Add Tag No Tags, Be the first to tag this record!

cited_by
cites
container_end_page	354
container_issue
container_start_page	344
container_title
container_volume
creator	Liu, Yi Zhou, Ruilin Gan, Yuhang Qian, Chen
description	LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.
doi_str_mv	10.1109/CLOUD62652.2024.00046
format	conference_proceeding
fullrecord	<record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_10643922</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10643922</ieee_id><sourcerecordid>10643922</sourcerecordid><originalsourceid>FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033</originalsourceid><addsrcrecordid>eNotzttKw0AYBOBVECw1b6CwL5D2__eY9a5Eq6GFgj14WTbZTRNtm5BDpW9vQK8GhuFjCHlCmCCCmcbL1fZFMSXZhAETEwAQ6oYERpuIS-AqklzdkhFDaUKFBu5J0LZfwwwhkhL5iHyu66pb7J5pcqqb6lKeD_TDW0c3RVP1h6LuO1rldLFb0_RKk-kqnP3YxtPYZoWn9uzozNm6Ky9D1WffVUXn5bHzTftA7nJ7bH3wn2Oynb9u4vdwuXpL4tkyLBFUN5wSiNxFPEWmuREaItA-RSG4cSljAjODGeNoNDjjMZPIHOZglVY6Bc7H5PHPLb33-7opT7a57gd7ABjjvypNT9U</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><source>IEEE Xplore All Conference Series</source><creator>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</creator><creatorcontrib>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</creatorcontrib><description>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</description><identifier>EISSN: 2159-6190</identifier><identifier>EISBN: 9798350368536</identifier><identifier>DOI: 10.1109/CLOUD62652.2024.00046</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Benchmark testing ; Caching ; Cloud computing ; Costs ; Count-Min sketch ; Cuckoo filter ; Data structures ; Filters ; KV store ; LSM tree ; Throughput</subject><ispartof>2024 IEEE 17th International Conference on Cloud Computing (CLOUD), 2024, p.344-354</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10643922$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,27902,54530,54907</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10643922$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Liu, Yi</creatorcontrib><creatorcontrib>Zhou, Ruilin</creatorcontrib><creatorcontrib>Gan, Yuhang</creatorcontrib><creatorcontrib>Qian, Chen</creatorcontrib><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><title>2024 IEEE 17th International Conference on Cloud Computing (CLOUD)</title><addtitle>CLOUD</addtitle><description>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</description><subject>Benchmark testing</subject><subject>Caching</subject><subject>Cloud computing</subject><subject>Costs</subject><subject>Count-Min sketch</subject><subject>Cuckoo filter</subject><subject>Data structures</subject><subject>Filters</subject><subject>KV store</subject><subject>LSM tree</subject><subject>Throughput</subject><issn>2159-6190</issn><isbn>9798350368536</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotzttKw0AYBOBVECw1b6CwL5D2__eY9a5Eq6GFgj14WTbZTRNtm5BDpW9vQK8GhuFjCHlCmCCCmcbL1fZFMSXZhAETEwAQ6oYERpuIS-AqklzdkhFDaUKFBu5J0LZfwwwhkhL5iHyu66pb7J5pcqqb6lKeD_TDW0c3RVP1h6LuO1rldLFb0_RKk-kqnP3YxtPYZoWn9uzozNm6Ky9D1WffVUXn5bHzTftA7nJ7bH3wn2Oynb9u4vdwuXpL4tkyLBFUN5wSiNxFPEWmuREaItA-RSG4cSljAjODGeNoNDjjMZPIHOZglVY6Bc7H5PHPLb33-7opT7a57gd7ABjjvypNT9U</recordid><startdate>20240707</startdate><enddate>20240707</enddate><creator>Liu, Yi</creator><creator>Zhou, Ruilin</creator><creator>Gan, Yuhang</creator><creator>Qian, Chen</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>20240707</creationdate><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><author>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Benchmark testing</topic><topic>Caching</topic><topic>Cloud computing</topic><topic>Costs</topic><topic>Count-Min sketch</topic><topic>Cuckoo filter</topic><topic>Data structures</topic><topic>Filters</topic><topic>KV store</topic><topic>LSM tree</topic><topic>Throughput</topic><toplevel>online_resources</toplevel><creatorcontrib>Liu, Yi</creatorcontrib><creatorcontrib>Zhou, Ruilin</creatorcontrib><creatorcontrib>Gan, Yuhang</creatorcontrib><creatorcontrib>Qian, Chen</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Yi</au><au>Zhou, Ruilin</au><au>Gan, Yuhang</au><au>Qian, Chen</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</atitle><btitle>2024 IEEE 17th International Conference on Cloud Computing (CLOUD)</btitle><stitle>CLOUD</stitle><date>2024-07-07</date><risdate>2024</risdate><spage>344</spage><epage>354</epage><pages>344-354</pages><eissn>2159-6190</eissn><eisbn>9798350368536</eisbn><coden>IEEPAD</coden><abstract>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</abstract><pub>IEEE</pub><doi>10.1109/CLOUD62652.2024.00046</doi><tpages>11</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	EISSN: 2159-6190
ispartof	2024 IEEE 17th International Conference on Cloud Computing (CLOUD), 2024, p.344-354
issn	2159-6190
language	eng
recordid	cdi_ieee_primary_10643922
source	IEEE Xplore All Conference Series
subjects	Benchmark testing Caching Cloud computing Costs Count-Min sketch Cuckoo filter Data structures Filters KV store LSM tree Throughput
title	SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T08%3A00%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=SpotKV:%20Improving%20Read%20Throughput%20of%20KVS%20by%20I/O-Aware%20Cache%20and%20Adaptive%20Cuckoo%20Filters&rft.btitle=2024%20IEEE%2017th%20International%20Conference%20on%20Cloud%20Computing%20(CLOUD)&rft.au=Liu,%20Yi&rft.date=2024-07-07&rft.spage=344&rft.epage=354&rft.pages=344-354&rft.eissn=2159-6190&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CLOUD62652.2024.00046&rft.eisbn=9798350368536&rft_dat=%3Cieee_CHZPO%3E10643922%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10643922&rfr_iscdi=true