Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms
This paper considers frequent itemset mining in transactional databases. It introduces a new accurate single-scan approach for frequent itemset mining (SSFIM), a heuristic alternative (EA-SSFIM), and a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFI...
Published in: | IEEE access 2018, Vol.6, p.68013-68026 |
---|---|
Main Authors: | Djenouri, Youcef; Djenouri, Djamel; Lin, Jerry Chun-Wei; Belhadi, Asma |
Format: | Article |
Language: | English |
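Subjects: | Algorithms; Apriori; Big Data; Clustering algorithms; Computer science; Data mining; frequent itemset mining; heuristic; Itemsets; parallel computing; Runtime; support computing |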
cited_by | cdi_FETCH-LOGICAL-c408t-a2f3f63e811903d278089c16453674903b43d071c5404a62cd177eec2885821d3 |
---|---|
cites | cdi_FETCH-LOGICAL-c408t-a2f3f63e811903d278089c16453674903b43d071c5404a62cd177eec2885821d3 |
container_end_page | 68026 |
container_issue | |
container_start_page | 68013 |
container_title | IEEE access |
container_volume | 6 |
creator | Djenouri, Youcef; Djenouri, Djamel; Lin, Jerry Chun-Wei; Belhadi, Asma |
description | This paper considers frequent itemset mining (FIM) in transactional databases. It introduces a new accurate single-scan approach for frequent itemset mining (SSFIM), a heuristic alternative (EA-SSFIM), and a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach (in all its variants) requires only one scan to extract the candidate itemsets, and it generates a fixed number of candidate itemsets regardless of the minimum support value. This accelerates the scan process compared with existing approaches when dealing with sparse and big databases. Numerical results show that SSFIM outperforms state-of-the-art FIM approaches on medium and large databases. Moreover, EA-SSFIM provides performance similar to SSFIM while considerably reducing the runtime for large databases. The results also show that MR-SSFIM outperforms existing HPC-based FIM solutions on sparse and big databases. |
doi_str_mv | 10.1109/ACCESS.2018.2880275 |
format | article |
fullrecord | <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_ieee_primary_8529189</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8529189</ieee_id><doaj_id>oai_doaj_org_article_632ccdace2d3477dbacbe8718dca6a0c</doaj_id><sourcerecordid>2455929741</sourcerecordid><originalsourceid>FETCH-LOGICAL-c408t-a2f3f63e811903d278089c16453674903b43d071c5404a62cd177eec2885821d3</originalsourceid><addsrcrecordid>eNpNUV1PwjAUXYwmEuQX8NLEZ7BfW7tHnKAkGB_Q-NiU9m6WjA27YuK_tzBC7Mttzr3n3NOeJBkTPCUE5w-zopiv11OKiZxSKTEV6VUyoCTLJyxl2fW_-20y6rotjkdGKBWDZLnw8H2AJqBlgF0HAb26xjUVcg16dBV60kGjTxe-0LwswQT3A2gd-3UsRjdoVletj-1dd5fclLruYHSuw-RjMX8vXiart-dlMVtNDMcyTDQtWZkxkITkmFkqJJa5IRmP9gSP0IYziwUxKcdcZ9RYIgSAiS9LJSWWDaPnk65t9Vbtvdtp_6ta7dQJaH2ltA_O1KAyRo2x2gC1jAthN9psQAoirdGZxiZq3fdae9_GX-iC2rYH30T7ivI0zWkuOIlTrJ8yvu06D-VlK8HqGIHqI1DHCNQ5gsga9ywHABeGTGlOZM7-ACcUf_w</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2455929741</pqid></control><display><type>article</type><title>Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms</title><source>IEEE Xplore Open Access Journals</source><creator>Djenouri, Youcef ; Djenouri, Djamel ; Lin, Jerry Chun-Wei ; Belhadi, Asma</creator><creatorcontrib>Djenouri, Youcef ; Djenouri, Djamel ; Lin, Jerry Chun-Wei ; Belhadi, Asma</creatorcontrib><description>This paper considers frequent itemsets mining in transactional databases. It introduces a new accurate single scan approach for frequent itemset mining (SSFIM), a heuristic as an alternative approach (EA-SSFIM), as well as a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach (in all its variants) requires only one scan to extract the candidate itemsets, and it has the advantage to generate a fixed number of candidate itemsets independently from the value of the minimum support. 
This accelerates the scan process compared with existing approaches while dealing with sparse and big databases. Numerical results show that SSFIM outperforms the state-of-the-art FIM approaches while dealing with medium and large databases. Moreover, EA-SSFIM provides similar performance as SSFIM while considerably reducing the runtime for large databases. The results also reveal the superiority of MR-SSFIM compared with the existing HPC-based solutions for FIM using sparse and big databases.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2018.2880275</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; Apriori ; Big Data ; Clustering algorithms ; Computer science ; Data mining ; frequent itemset mining ; heuristic ; Itemsets ; parallel computing ; Runtime ; support computing</subject><ispartof>IEEE access, 2018, Vol.6, p.68013-68026</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c408t-a2f3f63e811903d278089c16453674903b43d071c5404a62cd177eec2885821d3</citedby><cites>FETCH-LOGICAL-c408t-a2f3f63e811903d278089c16453674903b43d071c5404a62cd177eec2885821d3</cites><orcidid>0000-0001-8768-9709 ; 0000-0003-0135-7450</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8529189$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,4024,27633,27923,27924,27925,54933</link.rule.ids></links><search><creatorcontrib>Djenouri, Youcef</creatorcontrib><creatorcontrib>Djenouri, Djamel</creatorcontrib><creatorcontrib>Lin, Jerry Chun-Wei</creatorcontrib><creatorcontrib>Belhadi, Asma</creatorcontrib><title>Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms</title><title>IEEE access</title><addtitle>Access</addtitle><description>This paper considers frequent itemsets mining in transactional databases. It introduces a new accurate single scan approach for frequent itemset mining (SSFIM), a heuristic as an alternative approach (EA-SSFIM), as well as a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach (in all its variants) requires only one scan to extract the candidate itemsets, and it has the advantage to generate a fixed number of candidate itemsets independently from the value of the minimum support. This accelerates the scan process compared with existing approaches while dealing with sparse and big databases. Numerical results show that SSFIM outperforms the state-of-the-art FIM approaches while dealing with medium and large databases. 
Moreover, EA-SSFIM provides similar performance as SSFIM while considerably reducing the runtime for large databases. The results also reveal the superiority of MR-SSFIM compared with the existing HPC-based solutions for FIM using sparse and big databases.</description><subject>Algorithms</subject><subject>Apriori</subject><subject>Big Data</subject><subject>Clustering algorithms</subject><subject>Computer science</subject><subject>Data mining</subject><subject>frequent itemset mining</subject><subject>heuristic</subject><subject>Itemsets</subject><subject>parallel computing</subject><subject>Runtime</subject><subject>support computing</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>DOA</sourceid><recordid>eNpNUV1PwjAUXYwmEuQX8NLEZ7BfW7tHnKAkGB_Q-NiU9m6WjA27YuK_tzBC7Mttzr3n3NOeJBkTPCUE5w-zopiv11OKiZxSKTEV6VUyoCTLJyxl2fW_-20y6rotjkdGKBWDZLnw8H2AJqBlgF0HAb26xjUVcg16dBV60kGjTxe-0LwswQT3A2gd-3UsRjdoVletj-1dd5fclLruYHSuw-RjMX8vXiart-dlMVtNDMcyTDQtWZkxkITkmFkqJJa5IRmP9gSP0IYziwUxKcdcZ9RYIgSAiS9LJSWWDaPnk65t9Vbtvdtp_6ta7dQJaH2ltA_O1KAyRo2x2gC1jAthN9psQAoirdGZxiZq3fdae9_GX-iC2rYH30T7ivI0zWkuOIlTrJ8yvu06D-VlK8HqGIHqI1DHCNQ5gsga9ywHABeGTGlOZM7-ACcUf_w</recordid><startdate>2018</startdate><enddate>2018</enddate><creator>Djenouri, Youcef</creator><creator>Djenouri, Djamel</creator><creator>Lin, Jerry Chun-Wei</creator><creator>Belhadi, Asma</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-8768-9709</orcidid><orcidid>https://orcid.org/0000-0003-0135-7450</orcidid></search><sort><creationdate>2018</creationdate><title>Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms</title><author>Djenouri, Youcef ; Djenouri, Djamel ; Lin, Jerry Chun-Wei ; Belhadi, Asma</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c408t-a2f3f63e811903d278089c16453674903b43d071c5404a62cd177eec2885821d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithms</topic><topic>Apriori</topic><topic>Big Data</topic><topic>Clustering algorithms</topic><topic>Computer science</topic><topic>Data mining</topic><topic>frequent itemset mining</topic><topic>heuristic</topic><topic>Itemsets</topic><topic>parallel computing</topic><topic>Runtime</topic><topic>support computing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Djenouri, Youcef</creatorcontrib><creatorcontrib>Djenouri, Djamel</creatorcontrib><creatorcontrib>Lin, Jerry Chun-Wei</creatorcontrib><creatorcontrib>Belhadi, Asma</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Xplore Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials 
Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Djenouri, Youcef</au><au>Djenouri, Djamel</au><au>Lin, Jerry Chun-Wei</au><au>Belhadi, Asma</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2018</date><risdate>2018</risdate><volume>6</volume><spage>68013</spage><epage>68026</epage><pages>68013-68026</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>This paper considers frequent itemsets mining in transactional databases. It introduces a new accurate single scan approach for frequent itemset mining (SSFIM), a heuristic as an alternative approach (EA-SSFIM), as well as a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach (in all its variants) requires only one scan to extract the candidate itemsets, and it has the advantage to generate a fixed number of candidate itemsets independently from the value of the minimum support. This accelerates the scan process compared with existing approaches while dealing with sparse and big databases. Numerical results show that SSFIM outperforms the state-of-the-art FIM approaches while dealing with medium and large databases. 
Moreover, EA-SSFIM provides similar performance as SSFIM while considerably reducing the runtime for large databases. The results also reveal the superiority of MR-SSFIM compared with the existing HPC-based solutions for FIM using sparse and big databases.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2018.2880275</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-8768-9709</orcidid><orcidid>https://orcid.org/0000-0003-0135-7450</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2018, Vol.6, p.68013-68026 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_ieee_primary_8529189 |
source | IEEE Xplore Open Access Journals |
subjects | Algorithms; Apriori; Big Data; Clustering algorithms; Computer science; Data mining; frequent itemset mining; heuristic; Itemsets; parallel computing; Runtime; support computing |
title | Frequent Itemset Mining in Big Data With Effective Single Scan Algorithms |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T01%3A31%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Frequent%20Itemset%20Mining%20in%20Big%20Data%20With%20Effective%20Single%20Scan%20Algorithms&rft.jtitle=IEEE%20access&rft.au=Djenouri,%20Youcef&rft.date=2018&rft.volume=6&rft.spage=68013&rft.epage=68026&rft.pages=68013-68026&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2018.2880275&rft_dat=%3Cproquest_ieee_%3E2455929741%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c408t-a2f3f63e811903d278089c16453674903b43d071c5404a62cd177eec2885821d3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2455929741&rft_id=info:pmid/&rft_ieee_id=8529189&rfr_iscdi=true |
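The description above states that the approach needs only one scan to extract candidate itemsets, and that the number of candidates does not depend on the minimum support. A minimal Python sketch of that idea follows. It assumes that candidates are produced by enumerating every non-empty subset of each transaction, a detail the record does not spell out. The function name `single_scan_frequent_itemsets`, its parameters, and the toy transactions are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def single_scan_frequent_itemsets(transactions, min_support):
    """Illustrative single-scan miner (an assumption, not the paper's code).

    transactions: list of iterables of hashable, sortable items.
    min_support: fraction in [0, 1] of transactions an itemset must appear in.
    """
    counts = Counter()
    # Single pass over the database: every non-empty subset of a transaction
    # is counted as a candidate. min_support plays no role in this loop, so
    # the set of candidates is the same for any threshold.
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # The threshold is applied only after the scan has finished.
    threshold = min_support * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


# Hypothetical toy database of four transactions.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b"],
]
frequent = single_scan_frequent_itemsets(transactions, min_support=0.5)
```

Enumerating all subsets costs time exponential in the transaction length, so a sketch like this is only practical when transactions are short.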