
An Efficient Sequential Pattern Mining Algorithm Based on the 2-Sequence Matrix

Bibliographic Details
Main Authors: Chia-Ying Hsieh, Don-Lin Yang, Jungpin Wu
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 591
container_issue
container_start_page 583
container_title
container_volume
creator Chia-Ying Hsieh
Don-Lin Yang
Jungpin Wu
description Sequential pattern mining has become more and more popular in recent years due to its wide applications and the fact that it can find more information than association rules. Two famous algorithms in sequential pattern mining are AprioriAll and PrefixSpan. These two algorithms not only need to scan a database or projected-databases many times, but also require setting a minimal support threshold to prune infrequent data to obtain useful sequential patterns efficiently. In addition, they must rescan the database if new items or sequences are added. In this paper, we propose a novel algorithm called efficient sequential pattern enumeration (ESPE) to solve the above problems. In addition, our method can be applied in many applications, such as for the itemsets appearing at the same time in a sequence. In our experiments, we show that the performance of ESPE is better than the other two methods using various datasets.
doi_str_mv 10.1109/ICDMW.2008.82
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_4733982</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4733982</ieee_id><sourcerecordid>4733982</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-10553b8efe3f1604c5e45bc62a15bbe495a997c873398d4f98950027c33b11e73</originalsourceid><addsrcrecordid>eNo9jMFOwzAQBQ0CiVJ65MTFP5Cw9saJ9xhCgUqNikQljpWTblqjNIXESOXvQRRxmneYN0JcK4iVArqdFffla6wBbGz1ibiELCWDBtCeipHGzESkDZ39b9QXYjIMbwCgCBMiPRKLvJPTpvG15y7IF_74_KF3rXx2IXDfydJ3vtvIvN3sex-2O3nnBl7LfSfDlqWOjpeaZelC7w9X4rxx7cCTP47F8mG6LJ6i-eJxVuTzyBOESIExWFluGBuVQlIbTkxVp9opU1WckHFEWW0zRLLrpCFLBkBnNWKlFGc4FjfHrGfm1Xvvd67_WiW_vsZvYYJOUQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>An Efficient Sequential Pattern Mining Algorithm Based on the 2-Sequence Matrix</title><source>IEEE Xplore All Conference Series</source><creator>Chia-Ying Hsieh ; Don-Lin Yang ; Jungpin Wu</creator><creatorcontrib>Chia-Ying Hsieh ; Don-Lin Yang ; Jungpin Wu</creatorcontrib><description>Sequential pattern mining has become more and more popular in recent years due to its wide applications and the fact that it can find more information than association rules. Two famous algorithms in sequential pattern mining are AprioriAll and PrefixSpan. These two algorithms not only need to scan a database or projected-databases many times, but also require setting a minimal support threshold to prune infrequent data to obtain useful sequential patterns efficiently. In addition, they must rescan the database if new items or sequences are added. In this paper, we propose a novel algorithm called efficient sequential pattern enumeration (ESPE) to solve the above problems. In addition, our method can be applied in many applications, such as for the itemsets appearing at the same time in a sequence. 
In our experiments, we show that the performance of ESPE is better than the other two methods using various datasets.</description><identifier>ISSN: 2375-9232</identifier><identifier>EISSN: 2375-9259</identifier><identifier>EISBN: 0769535038</identifier><identifier>EISBN: 9780769535036</identifier><identifier>DOI: 10.1109/ICDMW.2008.82</identifier><language>eng</language><publisher>IEEE</publisher><subject>Application software ; association rule ; Association rules ; Bioinformatics ; candidate enumeration ; Computer science ; Conferences ; Data engineering ; Data mining ; Itemsets ; minimum support ; Sequential pattern ; Statistics</subject><ispartof>2008 IEEE International Conference on Data Mining Workshops, 2008, p.583-591</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4733982$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,23930,23931,25140,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4733982$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Chia-Ying Hsieh</creatorcontrib><creatorcontrib>Don-Lin Yang</creatorcontrib><creatorcontrib>Jungpin Wu</creatorcontrib><title>An Efficient Sequential Pattern Mining Algorithm Based on the 2-Sequence Matrix</title><title>2008 IEEE International Conference on Data Mining Workshops</title><addtitle>ICDMW</addtitle><description>Sequential pattern mining has become more and more popular in recent years due to its wide applications and the fact that it can find more information than association rules. Two famous algorithms in sequential pattern mining are AprioriAll and PrefixSpan. 
These two algorithms not only need to scan a database or projected-databases many times, but also require setting a minimal support threshold to prune infrequent data to obtain useful sequential patterns efficiently. In addition, they must rescan the database if new items or sequences are added. In this paper, we propose a novel algorithm called efficient sequential pattern enumeration (ESPE) to solve the above problems. In addition, our method can be applied in many applications, such as for the itemsets appearing at the same time in a sequence. In our experiments, we show that the performance of ESPE is better than the other two methods using various datasets.</description><subject>Application software</subject><subject>association rule</subject><subject>Association rules</subject><subject>Bioinformatics</subject><subject>candidate enumeration</subject><subject>Computer science</subject><subject>Conferences</subject><subject>Data engineering</subject><subject>Data mining</subject><subject>Itemsets</subject><subject>minimum support</subject><subject>Sequential pattern</subject><subject>Statistics</subject><issn>2375-9232</issn><issn>2375-9259</issn><isbn>0769535038</isbn><isbn>9780769535036</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2008</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNo9jMFOwzAQBQ0CiVJ65MTFP5Cw9saJ9xhCgUqNikQljpWTblqjNIXESOXvQRRxmneYN0JcK4iVArqdFffla6wBbGz1ibiELCWDBtCeipHGzESkDZ39b9QXYjIMbwCgCBMiPRKLvJPTpvG15y7IF_74_KF3rXx2IXDfydJ3vtvIvN3sex-2O3nnBl7LfSfDlqWOjpeaZelC7w9X4rxx7cCTP47F8mG6LJ6i-eJxVuTzyBOESIExWFluGBuVQlIbTkxVp9opU1WckHFEWW0zRLLrpCFLBkBnNWKlFGc4FjfHrGfm1Xvvd67_WiW_vsZvYYJOUQ</recordid><startdate>200812</startdate><enddate>200812</enddate><creator>Chia-Ying Hsieh</creator><creator>Don-Lin Yang</creator><creator>Jungpin 
Wu</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200812</creationdate><title>An Efficient Sequential Pattern Mining Algorithm Based on the 2-Sequence Matrix</title><author>Chia-Ying Hsieh ; Don-Lin Yang ; Jungpin Wu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-10553b8efe3f1604c5e45bc62a15bbe495a997c873398d4f98950027c33b11e73</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Application software</topic><topic>association rule</topic><topic>Association rules</topic><topic>Bioinformatics</topic><topic>candidate enumeration</topic><topic>Computer science</topic><topic>Conferences</topic><topic>Data engineering</topic><topic>Data mining</topic><topic>Itemsets</topic><topic>minimum support</topic><topic>Sequential pattern</topic><topic>Statistics</topic><toplevel>online_resources</toplevel><creatorcontrib>Chia-Ying Hsieh</creatorcontrib><creatorcontrib>Don-Lin Yang</creatorcontrib><creatorcontrib>Jungpin Wu</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE/IET Electronic Library</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Chia-Ying Hsieh</au><au>Don-Lin Yang</au><au>Jungpin Wu</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>An Efficient Sequential Pattern Mining Algorithm Based on the 2-Sequence Matrix</atitle><btitle>2008 IEEE International Conference on Data Mining 
Workshops</btitle><stitle>ICDMW</stitle><date>2008-12</date><risdate>2008</risdate><spage>583</spage><epage>591</epage><pages>583-591</pages><issn>2375-9232</issn><eissn>2375-9259</eissn><eisbn>0769535038</eisbn><eisbn>9780769535036</eisbn><abstract>Sequential pattern mining has become more and more popular in recent years due to its wide applications and the fact that it can find more information than association rules. Two famous algorithms in sequential pattern mining are AprioriAll and PrefixSpan. These two algorithms not only need to scan a database or projected-databases many times, but also require setting a minimal support threshold to prune infrequent data to obtain useful sequential patterns efficiently. In addition, they must rescan the database if new items or sequences are added. In this paper, we propose a novel algorithm called efficient sequential pattern enumeration (ESPE) to solve the above problems. In addition, our method can be applied in many applications, such as for the itemsets appearing at the same time in a sequence. In our experiments, we show that the performance of ESPE is better than the other two methods using various datasets.</abstract><pub>IEEE</pub><doi>10.1109/ICDMW.2008.82</doi><tpages>9</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2375-9232
ispartof 2008 IEEE International Conference on Data Mining Workshops, 2008, p.583-591
issn 2375-9232
2375-9259
language eng
recordid cdi_ieee_primary_4733982
source IEEE Xplore All Conference Series
subjects Application software
association rule
Association rules
Bioinformatics
candidate enumeration
Computer science
Conferences
Data engineering
Data mining
Itemsets
minimum support
Sequential pattern
Statistics
title An Efficient Sequential Pattern Mining Algorithm Based on the 2-Sequence Matrix
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-30T14%3A27%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=An%20Efficient%20Sequential%20Pattern%20Mining%20Algorithm%20Based%20on%20the%202-Sequence%20Matrix&rft.btitle=2008%20IEEE%20International%20Conference%20on%20Data%20Mining%20Workshops&rft.au=Chia-Ying%20Hsieh&rft.date=2008-12&rft.spage=583&rft.epage=591&rft.pages=583-591&rft.issn=2375-9232&rft.eissn=2375-9259&rft_id=info:doi/10.1109/ICDMW.2008.82&rft.eisbn=0769535038&rft.eisbn_list=9780769535036&rft_dat=%3Cieee_CHZPO%3E4733982%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i90t-10553b8efe3f1604c5e45bc62a15bbe495a997c873398d4f98950027c33b11e73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4733982&rfr_iscdi=true