Loading…

Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams

An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on knowledge and data engineering 2008-12, Vol.20 (12), p.1627-1640
Main Authors: Songting Chen, Hua-Gang Li, Tatemura, J., Wang-Pin Hsiung, Agrawal, D., Candan, K.S.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
cited_by cdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763
cites cdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763
container_end_page 1640
container_issue 12
container_start_page 1627
container_title IEEE transactions on knowledge and data engineering
container_volume 20
creator Songting Chen
Hua-Gang Li
Tatemura, J.
Wang-Pin Hsiung
Agrawal, D.
Candan, K.S.
description An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.
doi_str_mv 10.1109/TKDE.2008.83
format article
fullrecord <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_miscellaneous_34677884</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4509431</ieee_id><sourcerecordid>34677884</sourcerecordid><originalsourceid>FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</originalsourceid><addsrcrecordid>eNp90c9LwzAUB_AiCs7pzZuXIqgXO19-LelR5jbFDSeb4C1k2at0dO1MWkH_ejM2dvDgKY_3PnmQfKPonECHEEjvZs8P_Q4FUB3FDqIWEUIllKTkMNTAScIZl8fRifdLCEgq0oomU2sKMy8wHuRFjS4vP-Iqi8dNUefr0B1iic4U-Q8ukplDTCamDqyMX5uA0cfVF7r4fTyKp7VDs_Kn0VFmCo9nu7MdvQ36s95jMnoZPvXuR4llUtSJTLucg1TGWII45ywzQhJFgRigNkO5CB2kC8gEWDW3lnU5oXYh5zYVVnZZO7rZ7l276rNBX-tV7i0WhSmxarxWUgCnXNEgr_-VjHelVIoHePkHLqvGleEVOiWUypQrFdDtFllXee8w02uXr4z71gT0JgW9SUFvUtCKBX6122l8-OnMmdLmfn-HghKgCAR3sXU5Iu7HXEDKGWG_SpiO0A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>912279488</pqid></control><display><type>article</type><title>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</title><source>IEEE Xplore (Online service)</source><creator>Songting Chen ; Hua-Gang Li ; Tatemura, J. ; Wang-Pin Hsiung ; Agrawal, D. ; Candan, K.S.</creator><creatorcontrib>Songting Chen ; Hua-Gang Li ; Tatemura, J. ; Wang-Pin Hsiung ; Agrawal, D. ; Candan, K.S.</creatorcontrib><description>An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2008.83</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Algorithms ; Applied sciences ; Computer science; control theory; systems ; Data processing. List processing. Character string processing ; Database languages ; Encoding ; Exact sciences and technology ; Extensible Markup Language ; Filtering ; Filtering algorithms ; Filtration ; generalized-tree-pattern queries ; Information systems. Data bases ; Matched filters ; Memory organisation. Data processing ; Merging ; Polynomials ; Queries ; result encoding ; Scalability ; Software ; Streams ; Studies ; Telecommunication traffic ; Web services ; XML ; XML filtering ; XML streams</subject><ispartof>IEEE transactions on knowledge and data engineering, 2008-12, Vol.20 (12), p.1627-1640</ispartof><rights>2009 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2008</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</citedby><cites>FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4509431$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=20850810$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Songting Chen</creatorcontrib><creatorcontrib>Hua-Gang Li</creatorcontrib><creatorcontrib>Tatemura, J.</creatorcontrib><creatorcontrib>Wang-Pin Hsiung</creatorcontrib><creatorcontrib>Agrawal, D.</creatorcontrib><creatorcontrib>Candan, K.S.</creatorcontrib><title>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Computer science; control theory; systems</subject><subject>Data processing. List processing. Character string processing</subject><subject>Database languages</subject><subject>Encoding</subject><subject>Exact sciences and technology</subject><subject>Extensible Markup Language</subject><subject>Filtering</subject><subject>Filtering algorithms</subject><subject>Filtration</subject><subject>generalized-tree-pattern queries</subject><subject>Information systems. Data bases</subject><subject>Matched filters</subject><subject>Memory organisation. Data processing</subject><subject>Merging</subject><subject>Polynomials</subject><subject>Queries</subject><subject>result encoding</subject><subject>Scalability</subject><subject>Software</subject><subject>Streams</subject><subject>Studies</subject><subject>Telecommunication traffic</subject><subject>Web services</subject><subject>XML</subject><subject>XML filtering</subject><subject>XML streams</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><recordid>eNp90c9LwzAUB_AiCs7pzZuXIqgXO19-LelR5jbFDSeb4C1k2at0dO1MWkH_ejM2dvDgKY_3PnmQfKPonECHEEjvZs8P_Q4FUB3FDqIWEUIllKTkMNTAScIZl8fRifdLCEgq0oomU2sKMy8wHuRFjS4vP-Iqi8dNUefr0B1iic4U-Q8ukplDTCamDqyMX5uA0cfVF7r4fTyKp7VDs_Kn0VFmCo9nu7MdvQ36s95jMnoZPvXuR4llUtSJTLucg1TGWII45ywzQhJFgRigNkO5CB2kC8gEWDW3lnU5oXYh5zYVVnZZO7rZ7l276rNBX-tV7i0WhSmxarxWUgCnXNEgr_-VjHelVIoHePkHLqvGleEVOiWUypQrFdDtFllXee8w02uXr4z71gT0JgW9SUFvUtCKBX6122l8-OnMmdLmfn-HghKgCAR3sXU5Iu7HXEDKGWG_SpiO0A</recordid><startdate>20081201</startdate><enddate>20081201</enddate><creator>Songting Chen</creator><creator>Hua-Gang Li</creator><creator>Tatemura, J.</creator><creator>Wang-Pin Hsiung</creator><creator>Agrawal, D.</creator><creator>Candan, K.S.</creator><general>IEEE</general><general>IEEE Computer Society</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20081201</creationdate><title>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</title><author>Songting Chen ; Hua-Gang Li ; Tatemura, J. ; Wang-Pin Hsiung ; Agrawal, D. ; Candan, K.S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Computer science; control theory; systems</topic><topic>Data processing. List processing. Character string processing</topic><topic>Database languages</topic><topic>Encoding</topic><topic>Exact sciences and technology</topic><topic>Extensible Markup Language</topic><topic>Filtering</topic><topic>Filtering algorithms</topic><topic>Filtration</topic><topic>generalized-tree-pattern queries</topic><topic>Information systems. Data bases</topic><topic>Matched filters</topic><topic>Memory organisation. Data processing</topic><topic>Merging</topic><topic>Polynomials</topic><topic>Queries</topic><topic>result encoding</topic><topic>Scalability</topic><topic>Software</topic><topic>Streams</topic><topic>Studies</topic><topic>Telecommunication traffic</topic><topic>Web services</topic><topic>XML</topic><topic>XML filtering</topic><topic>XML streams</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Songting Chen</creatorcontrib><creatorcontrib>Hua-Gang Li</creatorcontrib><creatorcontrib>Tatemura, J.</creatorcontrib><creatorcontrib>Wang-Pin Hsiung</creatorcontrib><creatorcontrib>Agrawal, D.</creatorcontrib><creatorcontrib>Candan, K.S.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore (Online service)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Songting Chen</au><au>Hua-Gang Li</au><au>Tatemura, J.</au><au>Wang-Pin Hsiung</au><au>Agrawal, D.</au><au>Candan, K.S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2008-12-01</date><risdate>2008</risdate><volume>20</volume><issue>12</issue><spage>1627</spage><epage>1640</epage><pages>1627-1640</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TKDE.2008.83</doi><tpages>14</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1041-4347
ispartof IEEE transactions on knowledge and data engineering, 2008-12, Vol.20 (12), p.1627-1640
issn 1041-4347
1558-2191
language eng
recordid cdi_proquest_miscellaneous_34677884
source IEEE Xplore (Online service)
subjects Algorithms
Applied sciences
Computer science
control theory
systems
Data processing. List processing. Character string processing
Database languages
Encoding
Exact sciences and technology
Extensible Markup Language
Filtering
Filtering algorithms
Filtration
generalized-tree-pattern queries
Information systems. Data bases
Matched filters
Memory organisation. Data processing
Merging
Polynomials
Queries
result encoding
Scalability
Software
Streams
Studies
Telecommunication traffic
Web services
XML
XML filtering
XML streams
title Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T12%3A09%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Scalable%20Filtering%20of%20Multiple%20Generalized-Tree-Pattern%20Queries%20over%20XML%20Streams&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Songting%20Chen&rft.date=2008-12-01&rft.volume=20&rft.issue=12&rft.spage=1627&rft.epage=1640&rft.pages=1627-1640&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2008.83&rft_dat=%3Cproquest_ieee_%3E34677884%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=912279488&rft_id=info:pmid/&rft_ieee_id=4509431&rfr_iscdi=true