Loading…
Uniformity and Independence of H3 Hash Functions for Bloom Filters
In this paper, we investigate the effects of violating the conditions of hash function uniformity and/or independence on the false positive probability of Bloom Filters (BF). To this end, we focus on hash functions of the H3 family with a partitioned memory organization for fast hardware implementat...
Saved in:
Published in: | IEEE transactions on computers 2024-08, Vol.73 (8), p.1913-1923 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
cited_by | |
---|---|
cites | |
container_end_page | 1923 |
container_issue | 8 |
container_start_page | 1913 |
container_title | IEEE transactions on computers |
container_volume | 73 |
creator | Koltuk, Furkan Schmidt, Ece Guran |
description | In this paper, we investigate the effects of violating the conditions of hash function uniformity and/or independence on the false positive probability of Bloom Filters (BF). To this end, we focus on hash functions of the H3 family with a partitioned memory organization for fast hardware implementations of BFs. We first introduce a dependence metric that quantifies hash function uniformity and independence. We then state and prove the necessary and sufficient conditions on the BF parameters for constructing uniform and independent hash functions. Finally, we derive an analytical expression for the exact false positive probability of a BF with hash functions that are not necessarily uniform or independent. We verify our expression with a hardware test bench and explore the effects of losing uniformity and independence through an experimental study that systematically sweeps different dependence metric values and numbers of hash functions. We demonstrate the effects of violating hash function uniformity and independence on the stated target false positive probability for selected previous works in the literature. As an important finding, we show that uniformity of individual hash functions is essential, whereas limited dependencies between hash functions can be tolerated without a negative effect on the false positive probability. |
doi_str_mv | 10.1109/TC.2024.3398426 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TC_2024_3398426</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10527416</ieee_id><sourcerecordid>3078087045</sourcerecordid><originalsourceid>FETCH-LOGICAL-c244t-c845fe82ad7717e6f483eec0893abe9936da7792c873f7b4084bc202951d6afa3</originalsourceid><addsrcrecordid>eNpNkD1vwjAQhq2qlUpp5y4dLHUOnD8S22OJSkFC6gKzZZyzGgQxtcPAv28QDF3ulud9T_cQ8spgwhiY6bqecOByIoTRkld3ZMTKUhXGlNU9GQEwXRgh4ZE85bwDgIqDGZHZpmtDTIe2P1PXNXTZNXjEYXQeaQx0IejC5R86P3W-b2OX6UDT2T7GA523-x5TfiYPwe0zvtz2mGzmn-t6Uay-v5b1x6rwXMq-8FqWATV3jVJMYRWkFogetBFui8aIqnFKGe61EkFtJWi59cNHpmRN5YITY_J-7T2m-HvC3NtdPKVuOGkFKA1agSwHanqlfIo5Jwz2mNqDS2fLwF5E2XVtL6LsTdSQeLsmWkT8R5dcSVaJP43IYpI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3078087045</pqid></control><display><type>article</type><title>Uniformity and Independence of H3 Hash Functions for Bloom Filters</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Koltuk, Furkan ; Schmidt, Ece Guran</creator><creatorcontrib>Koltuk, Furkan ; Schmidt, Ece Guran</creatorcontrib><description>In this paper, we investigate the effects of violating the conditions of hash function uniformity and/or independence on the false positive probability of Bloom Filters (BF). To this end, we focus on hash functions of the H3 family with a partitioned memory organization for fast hardware implementations of BFs. We first introduce a dependence metric that quantifies hash function uniformity and independence. We then state and prove the necessary and sufficient conditions on the BF parameters for constructing uniform and independent hash functions. Finally, we derive an analytical expression for the exact false positive probability of a BF with hash functions that are not necessarily uniform or independent. We verify our expression with a hardware test bench and explore the effects of losing uniformity and independence through an experimental study that systematically sweeps different dependence metric values and numbers of hash functions. We demonstrate the effects of violating hash function uniformity and independence on the stated target false positive probability for selected previous works in the literature. As an important finding, we show that uniformity of individual hash functions is essential, whereas limited dependencies between hash functions can be tolerated without a negative effect on the false positive probability.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2024.3398426</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Bloom filters ; Computers ; Filters ; Hardware ; hash function uniformity and independence ; Hash functions ; Memory management ; Testing ; universal hash functions ; vector subspaces in GF ; Vectors</subject><ispartof>IEEE transactions on computers, 2024-08, Vol.73 (8), p.1913-1923</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0003-1807-4502 ; 0000-0002-4062-389X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10527416$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,54771</link.rule.ids></links><search><creatorcontrib>Koltuk, Furkan</creatorcontrib><creatorcontrib>Schmidt, Ece Guran</creatorcontrib><title>Uniformity and Independence of H3 Hash Functions for Bloom Filters</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>In this paper, we investigate the effects of violating the conditions of hash function uniformity and/or independence on the false positive probability of Bloom Filters (BF). To this end, we focus on hash functions of the H3 family with a partitioned memory organization for fast hardware implementations of BFs. We first introduce a dependence metric that quantifies hash function uniformity and independence. We then state and prove the necessary and sufficient conditions on the BF parameters for constructing uniform and independent hash functions. Finally, we derive an analytical expression for the exact false positive probability of a BF with hash functions that are not necessarily uniform or independent. We verify our expression with a hardware test bench and explore the effects of losing uniformity and independence through an experimental study that systematically sweeps different dependence metric values and numbers of hash functions. We demonstrate the effects of violating hash function uniformity and independence on the stated target false positive probability for selected previous works in the literature. As an important finding, we show that uniformity of individual hash functions is essential, whereas limited dependencies between hash functions can be tolerated without a negative effect on the false positive probability.</description><subject>Bloom filters</subject><subject>Computers</subject><subject>Filters</subject><subject>Hardware</subject><subject>hash function uniformity and independence</subject><subject>Hash functions</subject><subject>Memory management</subject><subject>Testing</subject><subject>universal hash functions</subject><subject>vector subspaces in GF</subject><subject>Vectors</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNpNkD1vwjAQhq2qlUpp5y4dLHUOnD8S22OJSkFC6gKzZZyzGgQxtcPAv28QDF3ulud9T_cQ8spgwhiY6bqecOByIoTRkld3ZMTKUhXGlNU9GQEwXRgh4ZE85bwDgIqDGZHZpmtDTIe2P1PXNXTZNXjEYXQeaQx0IejC5R86P3W-b2OX6UDT2T7GA523-x5TfiYPwe0zvtz2mGzmn-t6Uay-v5b1x6rwXMq-8FqWATV3jVJMYRWkFogetBFui8aIqnFKGe61EkFtJWi59cNHpmRN5YITY_J-7T2m-HvC3NtdPKVuOGkFKA1agSwHanqlfIo5Jwz2mNqDS2fLwF5E2XVtL6LsTdSQeLsmWkT8R5dcSVaJP43IYpI</recordid><startdate>20240801</startdate><enddate>20240801</enddate><creator>Koltuk, Furkan</creator><creator>Schmidt, Ece Guran</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-1807-4502</orcidid><orcidid>https://orcid.org/0000-0002-4062-389X</orcidid></search><sort><creationdate>20240801</creationdate><title>Uniformity and Independence of H3 Hash Functions for Bloom Filters</title><author>Koltuk, Furkan ; Schmidt, Ece Guran</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c244t-c845fe82ad7717e6f483eec0893abe9936da7792c873f7b4084bc202951d6afa3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Bloom filters</topic><topic>Computers</topic><topic>Filters</topic><topic>Hardware</topic><topic>hash function uniformity and independence</topic><topic>Hash functions</topic><topic>Memory management</topic><topic>Testing</topic><topic>universal hash functions</topic><topic>vector subspaces in GF</topic><topic>Vectors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Koltuk, Furkan</creatorcontrib><creatorcontrib>Schmidt, Ece Guran</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE/IET Electronic Library</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Koltuk, Furkan</au><au>Schmidt, Ece Guran</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Uniformity and Independence of H3 Hash Functions for Bloom Filters</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2024-08-01</date><risdate>2024</risdate><volume>73</volume><issue>8</issue><spage>1913</spage><epage>1923</epage><pages>1913-1923</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>In this paper, we investigate the effects of violating the conditions of hash function uniformity and/or independence on the false positive probability of Bloom Filters (BF). To this end, we focus on hash functions of the H3 family with a partitioned memory organization for fast hardware implementations of BFs. We first introduce a dependence metric that quantifies hash function uniformity and independence. We then state and prove the necessary and sufficient conditions on the BF parameters for constructing uniform and independent hash functions. Finally, we derive an analytical expression for the exact false positive probability of a BF with hash functions that are not necessarily uniform or independent. We verify our expression with a hardware test bench and explore the effects of losing uniformity and independence through an experimental study that systematically sweeps different dependence metric values and numbers of hash functions. We demonstrate the effects of violating hash function uniformity and independence on the stated target false positive probability for selected previous works in the literature. As an important finding, we show that uniformity of individual hash functions is essential, whereas limited dependencies between hash functions can be tolerated without a negative effect on the false positive probability.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TC.2024.3398426</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0003-1807-4502</orcidid><orcidid>https://orcid.org/0000-0002-4062-389X</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0018-9340 |
ispartof | IEEE transactions on computers, 2024-08, Vol.73 (8), p.1913-1923 |
issn | 0018-9340 1557-9956 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TC_2024_3398426 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Bloom filters Computers Filters Hardware hash function uniformity and independence Hash functions Memory management Testing universal hash functions vector subspaces in GF Vectors |
title | Uniformity and Independence of H3 Hash Functions for Bloom Filters |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T15%3A43%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Uniformity%20and%20Independence%20of%20H3%20Hash%20Functions%20for%20Bloom%20Filters&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Koltuk,%20Furkan&rft.date=2024-08-01&rft.volume=73&rft.issue=8&rft.spage=1913&rft.epage=1923&rft.pages=1913-1923&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2024.3398426&rft_dat=%3Cproquest_cross%3E3078087045%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c244t-c845fe82ad7717e6f483eec0893abe9936da7792c873f7b4084bc202951d6afa3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3078087045&rft_id=info:pmid/&rft_ieee_id=10527416&rfr_iscdi=true |