Loading…
SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters
LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent o...
Saved in:
Main Authors: | , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
cited_by | |
---|---|
cites | |
container_end_page | 354 |
container_issue | |
container_start_page | 344 |
container_title | |
container_volume | |
creator | Liu, Yi Zhou, Ruilin Gan, Yuhang Qian, Chen |
description | LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks. |
doi_str_mv | 10.1109/CLOUD62652.2024.00046 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_10643922</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10643922</ieee_id><sourcerecordid>10643922</sourcerecordid><originalsourceid>FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033</originalsourceid><addsrcrecordid>eNotzttKw0AYBOBVECw1b6CwL5D2__eY9a5Eq6GFgj14WTbZTRNtm5BDpW9vQK8GhuFjCHlCmCCCmcbL1fZFMSXZhAETEwAQ6oYERpuIS-AqklzdkhFDaUKFBu5J0LZfwwwhkhL5iHyu66pb7J5pcqqb6lKeD_TDW0c3RVP1h6LuO1rldLFb0_RKk-kqnP3YxtPYZoWn9uzozNm6Ky9D1WffVUXn5bHzTftA7nJ7bH3wn2Oynb9u4vdwuXpL4tkyLBFUN5wSiNxFPEWmuREaItA-RSG4cSljAjODGeNoNDjjMZPIHOZglVY6Bc7H5PHPLb33-7opT7a57gd7ABjjvypNT9U</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><source>IEEE Xplore All Conference Series</source><creator>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</creator><creatorcontrib>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</creatorcontrib><description>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</description><identifier>EISSN: 2159-6190</identifier><identifier>EISBN: 9798350368536</identifier><identifier>DOI: 10.1109/CLOUD62652.2024.00046</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Benchmark testing ; Caching ; Cloud computing ; Costs ; Count-Min sketch ; Cuckoo filter ; Data structures ; Filters ; KV store ; LSM tree ; Throughput</subject><ispartof>2024 IEEE 17th International Conference on Cloud Computing (CLOUD), 2024, p.344-354</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10643922$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,27902,54530,54907</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10643922$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Liu, Yi</creatorcontrib><creatorcontrib>Zhou, Ruilin</creatorcontrib><creatorcontrib>Gan, Yuhang</creatorcontrib><creatorcontrib>Qian, Chen</creatorcontrib><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><title>2024 IEEE 17th International Conference on Cloud Computing (CLOUD)</title><addtitle>CLOUD</addtitle><description>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</description><subject>Benchmark testing</subject><subject>Caching</subject><subject>Cloud computing</subject><subject>Costs</subject><subject>Count-Min sketch</subject><subject>Cuckoo filter</subject><subject>Data structures</subject><subject>Filters</subject><subject>KV store</subject><subject>LSM tree</subject><subject>Throughput</subject><issn>2159-6190</issn><isbn>9798350368536</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotzttKw0AYBOBVECw1b6CwL5D2__eY9a5Eq6GFgj14WTbZTRNtm5BDpW9vQK8GhuFjCHlCmCCCmcbL1fZFMSXZhAETEwAQ6oYERpuIS-AqklzdkhFDaUKFBu5J0LZfwwwhkhL5iHyu66pb7J5pcqqb6lKeD_TDW0c3RVP1h6LuO1rldLFb0_RKk-kqnP3YxtPYZoWn9uzozNm6Ky9D1WffVUXn5bHzTftA7nJ7bH3wn2Oynb9u4vdwuXpL4tkyLBFUN5wSiNxFPEWmuREaItA-RSG4cSljAjODGeNoNDjjMZPIHOZglVY6Bc7H5PHPLb33-7opT7a57gd7ABjjvypNT9U</recordid><startdate>20240707</startdate><enddate>20240707</enddate><creator>Liu, Yi</creator><creator>Zhou, Ruilin</creator><creator>Gan, Yuhang</creator><creator>Qian, Chen</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>20240707</creationdate><title>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</title><author>Liu, Yi ; Zhou, Ruilin ; Gan, Yuhang ; Qian, Chen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Benchmark testing</topic><topic>Caching</topic><topic>Cloud computing</topic><topic>Costs</topic><topic>Count-Min sketch</topic><topic>Cuckoo filter</topic><topic>Data structures</topic><topic>Filters</topic><topic>KV store</topic><topic>LSM tree</topic><topic>Throughput</topic><toplevel>online_resources</toplevel><creatorcontrib>Liu, Yi</creatorcontrib><creatorcontrib>Zhou, Ruilin</creatorcontrib><creatorcontrib>Gan, Yuhang</creatorcontrib><creatorcontrib>Qian, Chen</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Yi</au><au>Zhou, Ruilin</au><au>Gan, Yuhang</au><au>Qian, Chen</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters</atitle><btitle>2024 IEEE 17th International Conference on Cloud Computing (CLOUD)</btitle><stitle>CLOUD</stitle><date>2024-07-07</date><risdate>2024</risdate><spage>344</spage><epage>354</epage><pages>344-354</pages><eissn>2159-6190</eissn><eisbn>9798350368536</eisbn><coden>IEEPAD</coden><abstract>LSM tree based stores are a popular database design in modern persistent storage systems due to their efficient writes with sorted keys. However, this hierarchical log structure suffers from extensive read amplification because multiple disk accesses are required when it searches for a key. Recent optimizations of LSM trees propose caching hot keys to reduce I/Os mainly based on their access frequencies. However, our empirical studies show that keys are different in I/O costs, which should also be considered in the caching policy: caching key-value pairs with high I/O cost can effectively improve query latency. In addition, false positives incurred by the Bloom filters in LSM trees introduce a large overhead to access SSTables because the queried keys do not exist. In this work, we design and implement SpotKV, which resolves the above two problems in an LSM tree store by proposing two memory-efficient data structures, weighted Count- Min sketch for access and I/O-aware cache admission and dynamic-seed Cuckoo filters for eliminating false positives, to improve data lookup throughput. We implement SpotKV on Google's LevelDB vl.20. From extensive experimental evaluations, SpotKV achieves 1.2-3.0x read throughput while using the same or smaller memory, compared with several state- of-the-art LSM tree stores under the read-heavy workloads of the YCSB benchmarks.</abstract><pub>IEEE</pub><doi>10.1109/CLOUD62652.2024.00046</doi><tpages>11</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | EISSN: 2159-6190 |
ispartof | 2024 IEEE 17th International Conference on Cloud Computing (CLOUD), 2024, p.344-354 |
issn | 2159-6190 |
language | eng |
recordid | cdi_ieee_primary_10643922 |
source | IEEE Xplore All Conference Series |
subjects | Benchmark testing Caching Cloud computing Costs Count-Min sketch Cuckoo filter Data structures Filters KV store LSM tree Throughput |
title | SpotKV: Improving Read Throughput of KVS by I/O-Aware Cache and Adaptive Cuckoo Filters |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T08%3A00%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=SpotKV:%20Improving%20Read%20Throughput%20of%20KVS%20by%20I/O-Aware%20Cache%20and%20Adaptive%20Cuckoo%20Filters&rft.btitle=2024%20IEEE%2017th%20International%20Conference%20on%20Cloud%20Computing%20(CLOUD)&rft.au=Liu,%20Yi&rft.date=2024-07-07&rft.spage=344&rft.epage=354&rft.pages=344-354&rft.eissn=2159-6190&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CLOUD62652.2024.00046&rft.eisbn=9798350368536&rft_dat=%3Cieee_CHZPO%3E10643922%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i106t-614113d83b12739470807eb14439db2241c91c231970d9e1c512d1f0a6767b033%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10643922&rfr_iscdi=true |