Efficient and Versatile FPGA Acceleration of Support Counting for Stream Mining of Sequences and Frequent Itemsets

Bibliographic Details
Published in: ACM transactions on reconfigurable technology and systems 2017-09, Vol.10 (3), p.1-25
Main Authors: Prost-Boucle, Adrien, Pétrot, Frédéric, Leroy, Vincent, Alemdar, Hande
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c292t-ac741e14dab27bdc43c689a75fe93b0d6eaacc4b790f0490d4cf75e367349a5f3
cites cdi_FETCH-LOGICAL-c292t-ac741e14dab27bdc43c689a75fe93b0d6eaacc4b790f0490d4cf75e367349a5f3
container_end_page 25
container_issue 3
container_start_page 1
container_title ACM transactions on reconfigurable technology and systems
container_volume 10
creator Prost-Boucle, Adrien
Pétrot, Frédéric
Leroy, Vincent
Alemdar, Hande
description Stream processing has become extremely popular for analyzing huge volumes of data in a variety of applications, including IoT, social networks, retail, and software log analysis. Streams of data are produced continuously and are mined to extract patterns characterizing the data. A class of data-mining algorithms, called generate-and-test, produces a set of candidate patterns that are then evaluated over the data. The main challenges for these algorithms are to achieve high throughput, low latency, and reduced power consumption. In this article, we present a novel power-efficient, fast, and versatile hardware architecture whose objective is to monitor a set of target patterns and maintain their frequency over a stream of data. This accelerator can be used to accelerate data-mining algorithms, including itemset and sequence mining. The massive fine-grain reconfiguration capability of field-programmable gate array (FPGA) technologies is ideal for implementing the large number of pattern-detection units needed for these intensive data-mining applications. We have thus designed and implemented an IP that features high-density FPGA occupation and a high working frequency. We provide a detailed description of the IP's internal micro-architecture and of its implementation and optimization for the targeted FPGA resources. We validate our architecture by developing a co-designed implementation of the Apriori Frequent Itemset Mining (FIM) algorithm, and perform numerous experiments against existing hardware and software solutions. We demonstrate that FIM hardware acceleration is particularly efficient for large and low-density datasets (i.e., long-tailed datasets). Our IP reaches a data throughput of 250 million items/s and monitors up to 11.6k patterns simultaneously, on a prototyping board that consumes 24 W overall in the worst case. Furthermore, our hardware accelerator remains generic and can be integrated into other generate-and-test algorithms.
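The support-counting step that the description refers to can be illustrated in software. The following is a minimal Python sketch, not the article's hardware design: it assumes a stream of transactions (lists of items) and a fixed set of candidate patterns, and the names `count_itemset_support`, `is_subsequence`, and `count_sequence_support` are hypothetical helpers introduced only for this illustration.

```python
from typing import Dict, FrozenSet, Iterable, Sequence, Tuple


def count_itemset_support(
    stream: Iterable[Sequence[str]],
    candidates: Iterable[FrozenSet[str]],
) -> Dict[FrozenSet[str], int]:
    """Count, for each candidate itemset, the transactions that contain it."""
    counts = {candidate: 0 for candidate in candidates}
    for transaction in stream:
        items = set(transaction)
        for candidate in counts:
            # An itemset is supported when all of its items occur,
            # regardless of their order in the transaction.
            if candidate <= items:
                counts[candidate] += 1
    return counts


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if the pattern's items occur in the sequence in order."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the match,
    # so later pattern items must appear after earlier ones.
    return all(item in remaining for item in pattern)


def count_sequence_support(
    stream: Iterable[Sequence[str]],
    candidates: Iterable[Tuple[str, ...]],
) -> Dict[Tuple[str, ...], int]:
    """Count, for each candidate sequence, the transactions that contain it in order."""
    counts = {candidate: 0 for candidate in candidates}
    for transaction in stream:
        for candidate in counts:
            if is_subsequence(candidate, transaction):
                counts[candidate] += 1
    return counts


if __name__ == "__main__":
    transactions = [["a", "b", "c"], ["c", "a"], ["b"]]
    print(count_itemset_support(transactions, [frozenset({"a", "c"})]))
    print(count_sequence_support(transactions, [("a", "c")]))
```

In this sketch every candidate is tested against every transaction in a nested software loop. The description above suggests that the FPGA design instead relies on a large number of pattern-detection units, which is the part a sequential example like this cannot capture.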
doi_str_mv 10.1145/3027485
format article
fullrecord <record><control><sourceid>hal_cross</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_01474234v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>oai_HAL_hal_01474234v1</sourcerecordid><originalsourceid>FETCH-LOGICAL-c292t-ac741e14dab27bdc43c689a75fe93b0d6eaacc4b790f0490d4cf75e367349a5f3</originalsourceid><addsrcrecordid>eNo9kE9LAzEQxYMoWKv4FXITD6vJJps0x6X0H1QUql6XbHaiK9ukJqnQb2-3LT3NzI83j8dD6J6SJ0p58cxILvmouEADqpjIJKf88rwTcY1uYvwhRDAx4gMUJta2pgWXsHYN_oQQdWo7wNO3WYlLY6CDsCfeYW_xarvZ-JDw2G9dat0Xtj7gVQqg1_ildT3pVfC7BWcgHiyn4XAmvEiwjpDiLbqyuotwd5pD9DGdvI_n2fJ1thiXy8zkKk-ZNvvoQHmj61zWjeHMiJHSsrCgWE0aAVobw2upiCVckYYbKwtgQjKudGHZED0efb91V21Cu9ZhV3ndVvNyWfWMUC55zvgf3WsfjloTfIwB7PmBkqqvtTrVyv4B-tJqPQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Efficient and Versatile FPGA Acceleration of Support Counting for Stream Mining of Sequences and Frequent Itemsets</title><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>Prost-Boucle, Adrien ; Pétrot, FRédéric ; Leroy, Vincent ; Alemdar, Hande</creator><creatorcontrib>Prost-Boucle, Adrien ; Pétrot, FRédéric ; Leroy, Vincent ; Alemdar, Hande</creatorcontrib><description>Stream processing has become extremely popular for analyzing huge volumes of data for a variety of applications, including IoT, social networks, retail, and software logs analysis. Streams of data are produced continuously and are mined to extract patterns characterizing the data. A class of data mining algorithm, called generate-and-test , produces a set of candidate patterns that are then evaluated over data. The main challenges of these algorithms are to achieve high throughput, low latency, and reduced power consumption. 
In this article, we present a novel power-efficient, fast, and versatile hardware architecture whose objective is to monitor a set of target patterns to maintain their frequency over a stream of data. This accelerator can be used to accelerate data-mining algorithms, including itemsets and sequences mining. The massive fine-grain reconfiguration capability of field-programmable gate array (FPGA) technologies is ideal to implement the high number of pattern-detection units needed for these intensive data-mining applications. We have thus designed and implemented an IP that features high-density FPGA occupation and high working frequency. We provide detailed description of the IP internal micro-architecture and its actual implementation and optimization for the targeted FPGA resources. We validate our architecture by developing a co-designed implementation of the Apriori Frequent Itemset Mining (FIM) algorithm, and perform numerous experiments against existing hardware and software solutions. We demonstrate that FIM hardware acceleration is particularly efficient for large and low-density datasets (i.e., long-tailed datasets). Our IP reaches a data throughput of 250 million items/s and monitors up to 11.6k patterns simultaneously, on a prototyping board that overall consumes 24W in the worst case. 
Furthermore, our hardware accelerator remains generic and can be integrated to other generate and test algorithms.</description><identifier>ISSN: 1936-7406</identifier><identifier>EISSN: 1936-7414</identifier><identifier>DOI: 10.1145/3027485</identifier><language>eng</language><publisher>ACM</publisher><subject>Computer Science ; Hardware Architecture ; Information Retrieval</subject><ispartof>ACM transactions on reconfigurable technology and systems, 2017-09, Vol.10 (3), p.1-25</ispartof><rights>Attribution - NoDerivatives</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c292t-ac741e14dab27bdc43c689a75fe93b0d6eaacc4b790f0490d4cf75e367349a5f3</citedby><cites>FETCH-LOGICAL-c292t-ac741e14dab27bdc43c689a75fe93b0d6eaacc4b790f0490d4cf75e367349a5f3</cites><orcidid>0000-0001-5372-4334 ; 0000-0003-3464-726X ; 0000-0003-0624-7373</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27924,27925</link.rule.ids><backlink>$$Uhttps://hal.science/hal-01474234$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Prost-Boucle, Adrien</creatorcontrib><creatorcontrib>Pétrot, FRédéric</creatorcontrib><creatorcontrib>Leroy, Vincent</creatorcontrib><creatorcontrib>Alemdar, Hande</creatorcontrib><title>Efficient and Versatile FPGA Acceleration of Support Counting for Stream Mining of Sequences and Frequent Itemsets</title><title>ACM transactions on reconfigurable technology and systems</title><description>Stream processing has become extremely popular for analyzing huge volumes of data for a variety of applications, including IoT, social networks, retail, and software logs analysis. Streams of data are produced continuously and are mined to extract patterns characterizing the data. 
A class of data mining algorithm, called generate-and-test , produces a set of candidate patterns that are then evaluated over data. The main challenges of these algorithms are to achieve high throughput, low latency, and reduced power consumption. In this article, we present a novel power-efficient, fast, and versatile hardware architecture whose objective is to monitor a set of target patterns to maintain their frequency over a stream of data. This accelerator can be used to accelerate data-mining algorithms, including itemsets and sequences mining. The massive fine-grain reconfiguration capability of field-programmable gate array (FPGA) technologies is ideal to implement the high number of pattern-detection units needed for these intensive data-mining applications. We have thus designed and implemented an IP that features high-density FPGA occupation and high working frequency. We provide detailed description of the IP internal micro-architecture and its actual implementation and optimization for the targeted FPGA resources. We validate our architecture by developing a co-designed implementation of the Apriori Frequent Itemset Mining (FIM) algorithm, and perform numerous experiments against existing hardware and software solutions. We demonstrate that FIM hardware acceleration is particularly efficient for large and low-density datasets (i.e., long-tailed datasets). Our IP reaches a data throughput of 250 million items/s and monitors up to 11.6k patterns simultaneously, on a prototyping board that overall consumes 24W in the worst case. 
Furthermore, our hardware accelerator remains generic and can be integrated to other generate and test algorithms.</description><subject>Computer Science</subject><subject>Hardware Architecture</subject><subject>Information Retrieval</subject><issn>1936-7406</issn><issn>1936-7414</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><recordid>eNo9kE9LAzEQxYMoWKv4FXITD6vJJps0x6X0H1QUql6XbHaiK9ukJqnQb2-3LT3NzI83j8dD6J6SJ0p58cxILvmouEADqpjIJKf88rwTcY1uYvwhRDAx4gMUJta2pgWXsHYN_oQQdWo7wNO3WYlLY6CDsCfeYW_xarvZ-JDw2G9dat0Xtj7gVQqg1_ildT3pVfC7BWcgHiyn4XAmvEiwjpDiLbqyuotwd5pD9DGdvI_n2fJ1thiXy8zkKk-ZNvvoQHmj61zWjeHMiJHSsrCgWE0aAVobw2upiCVckYYbKwtgQjKudGHZED0efb91V21Cu9ZhV3ndVvNyWfWMUC55zvgf3WsfjloTfIwB7PmBkqqvtTrVyv4B-tJqPQ</recordid><startdate>20170930</startdate><enddate>20170930</enddate><creator>Prost-Boucle, Adrien</creator><creator>Pétrot, FRédéric</creator><creator>Leroy, Vincent</creator><creator>Alemdar, Hande</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0001-5372-4334</orcidid><orcidid>https://orcid.org/0000-0003-3464-726X</orcidid><orcidid>https://orcid.org/0000-0003-0624-7373</orcidid></search><sort><creationdate>20170930</creationdate><title>Efficient and Versatile FPGA Acceleration of Support Counting for Stream Mining of Sequences and Frequent Itemsets</title><author>Prost-Boucle, Adrien ; Pétrot, FRédéric ; Leroy, Vincent ; Alemdar, Hande</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c292t-ac741e14dab27bdc43c689a75fe93b0d6eaacc4b790f0490d4cf75e367349a5f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Computer Science</topic><topic>Hardware Architecture</topic><topic>Information 
Retrieval</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Prost-Boucle, Adrien</creatorcontrib><creatorcontrib>Pétrot, FRédéric</creatorcontrib><creatorcontrib>Leroy, Vincent</creatorcontrib><creatorcontrib>Alemdar, Hande</creatorcontrib><collection>CrossRef</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>ACM transactions on reconfigurable technology and systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Prost-Boucle, Adrien</au><au>Pétrot, FRédéric</au><au>Leroy, Vincent</au><au>Alemdar, Hande</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient and Versatile FPGA Acceleration of Support Counting for Stream Mining of Sequences and Frequent Itemsets</atitle><jtitle>ACM transactions on reconfigurable technology and systems</jtitle><date>2017-09-30</date><risdate>2017</risdate><volume>10</volume><issue>3</issue><spage>1</spage><epage>25</epage><pages>1-25</pages><issn>1936-7406</issn><eissn>1936-7414</eissn><abstract>Stream processing has become extremely popular for analyzing huge volumes of data for a variety of applications, including IoT, social networks, retail, and software logs analysis. Streams of data are produced continuously and are mined to extract patterns characterizing the data. A class of data mining algorithm, called generate-and-test , produces a set of candidate patterns that are then evaluated over data. The main challenges of these algorithms are to achieve high throughput, low latency, and reduced power consumption. In this article, we present a novel power-efficient, fast, and versatile hardware architecture whose objective is to monitor a set of target patterns to maintain their frequency over a stream of data. 
This accelerator can be used to accelerate data-mining algorithms, including itemsets and sequences mining. The massive fine-grain reconfiguration capability of field-programmable gate array (FPGA) technologies is ideal to implement the high number of pattern-detection units needed for these intensive data-mining applications. We have thus designed and implemented an IP that features high-density FPGA occupation and high working frequency. We provide detailed description of the IP internal micro-architecture and its actual implementation and optimization for the targeted FPGA resources. We validate our architecture by developing a co-designed implementation of the Apriori Frequent Itemset Mining (FIM) algorithm, and perform numerous experiments against existing hardware and software solutions. We demonstrate that FIM hardware acceleration is particularly efficient for large and low-density datasets (i.e., long-tailed datasets). Our IP reaches a data throughput of 250 million items/s and monitors up to 11.6k patterns simultaneously, on a prototyping board that overall consumes 24W in the worst case. Furthermore, our hardware accelerator remains generic and can be integrated to other generate and test algorithms.</abstract><pub>ACM</pub><doi>10.1145/3027485</doi><tpages>25</tpages><orcidid>https://orcid.org/0000-0001-5372-4334</orcidid><orcidid>https://orcid.org/0000-0003-3464-726X</orcidid><orcidid>https://orcid.org/0000-0003-0624-7373</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1936-7406
ispartof ACM transactions on reconfigurable technology and systems, 2017-09, Vol.10 (3), p.1-25
issn 1936-7406
1936-7414
language eng
recordid cdi_hal_primary_oai_HAL_hal_01474234v1
source Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
subjects Computer Science
Hardware Architecture
Information Retrieval
title Efficient and Versatile FPGA Acceleration of Support Counting for Stream Mining of Sequences and Frequent Itemsets
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T08%3A16%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-hal_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20and%20Versatile%20FPGA%20Acceleration%20of%20Support%20Counting%20for%20Stream%20Mining%20of%20Sequences%20and%20Frequent%20Itemsets&rft.jtitle=ACM%20transactions%20on%20reconfigurable%20technology%20and%20systems&rft.au=Prost-Boucle,%20Adrien&rft.date=2017-09-30&rft.volume=10&rft.issue=3&rft.spage=1&rft.epage=25&rft.pages=1-25&rft.issn=1936-7406&rft.eissn=1936-7414&rft_id=info:doi/10.1145/3027485&rft_dat=%3Chal_cross%3Eoai_HAL_hal_01474234v1%3C/hal_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c292t-ac741e14dab27bdc43c689a75fe93b0d6eaacc4b790f0490d4cf75e367349a5f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true