Loading…
Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams
An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on...
Saved in:
Published in: | IEEE transactions on knowledge and data engineering 2008-12, Vol.20 (12), p.1627-1640 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
cited_by | cdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763 |
---|---|
cites | cdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763 |
container_end_page | 1640 |
container_issue | 12 |
container_start_page | 1627 |
container_title | IEEE transactions on knowledge and data engineering |
container_volume | 20 |
creator | Songting Chen Hua-Gang Li Tatemura, J. Wang-Pin Hsiung Agrawal, D. Candan, K.S. |
description | An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries. |
doi_str_mv | 10.1109/TKDE.2008.83 |
format | article |
fullrecord | <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_miscellaneous_34677884</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4509431</ieee_id><sourcerecordid>34677884</sourcerecordid><originalsourceid>FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</originalsourceid><addsrcrecordid>eNp90c9LwzAUB_AiCs7pzZuXIqgXO19-LelR5jbFDSeb4C1k2at0dO1MWkH_ejM2dvDgKY_3PnmQfKPonECHEEjvZs8P_Q4FUB3FDqIWEUIllKTkMNTAScIZl8fRifdLCEgq0oomU2sKMy8wHuRFjS4vP-Iqi8dNUefr0B1iic4U-Q8ukplDTCamDqyMX5uA0cfVF7r4fTyKp7VDs_Kn0VFmCo9nu7MdvQ36s95jMnoZPvXuR4llUtSJTLucg1TGWII45ywzQhJFgRigNkO5CB2kC8gEWDW3lnU5oXYh5zYVVnZZO7rZ7l276rNBX-tV7i0WhSmxarxWUgCnXNEgr_-VjHelVIoHePkHLqvGleEVOiWUypQrFdDtFllXee8w02uXr4z71gT0JgW9SUFvUtCKBX6122l8-OnMmdLmfn-HghKgCAR3sXU5Iu7HXEDKGWG_SpiO0A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>912279488</pqid></control><display><type>article</type><title>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</title><source>IEEE Xplore (Online service)</source><creator>Songting Chen ; Hua-Gang Li ; Tatemura, J. ; Wang-Pin Hsiung ; Agrawal, D. ; Candan, K.S.</creator><creatorcontrib>Songting Chen ; Hua-Gang Li ; Tatemura, J. ; Wang-Pin Hsiung ; Agrawal, D. ; Candan, K.S.</creatorcontrib><description>An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2008.83</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Algorithms ; Applied sciences ; Computer science; control theory; systems ; Data processing. List processing. Character string processing ; Database languages ; Encoding ; Exact sciences and technology ; Extensible Markup Language ; Filtering ; Filtering algorithms ; Filtration ; generalized-tree-pattern queries ; Information systems. Data bases ; Matched filters ; Memory organisation. Data processing ; Merging ; Polynomials ; Queries ; result encoding ; Scalability ; Software ; Streams ; Studies ; Telecommunication traffic ; Web services ; XML ; XML filtering ; XML streams</subject><ispartof>IEEE transactions on knowledge and data engineering, 2008-12, Vol.20 (12), p.1627-1640</ispartof><rights>2009 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2008</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</citedby><cites>FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4509431$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=20850810$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Songting Chen</creatorcontrib><creatorcontrib>Hua-Gang Li</creatorcontrib><creatorcontrib>Tatemura, J.</creatorcontrib><creatorcontrib>Wang-Pin Hsiung</creatorcontrib><creatorcontrib>Agrawal, D.</creatorcontrib><creatorcontrib>Candan, K.S.</creatorcontrib><title>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Computer science; control theory; systems</subject><subject>Data processing. List processing. Character string processing</subject><subject>Database languages</subject><subject>Encoding</subject><subject>Exact sciences and technology</subject><subject>Extensible Markup Language</subject><subject>Filtering</subject><subject>Filtering algorithms</subject><subject>Filtration</subject><subject>generalized-tree-pattern queries</subject><subject>Information systems. Data bases</subject><subject>Matched filters</subject><subject>Memory organisation. Data processing</subject><subject>Merging</subject><subject>Polynomials</subject><subject>Queries</subject><subject>result encoding</subject><subject>Scalability</subject><subject>Software</subject><subject>Streams</subject><subject>Studies</subject><subject>Telecommunication traffic</subject><subject>Web services</subject><subject>XML</subject><subject>XML filtering</subject><subject>XML streams</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><recordid>eNp90c9LwzAUB_AiCs7pzZuXIqgXO19-LelR5jbFDSeb4C1k2at0dO1MWkH_ejM2dvDgKY_3PnmQfKPonECHEEjvZs8P_Q4FUB3FDqIWEUIllKTkMNTAScIZl8fRifdLCEgq0oomU2sKMy8wHuRFjS4vP-Iqi8dNUefr0B1iic4U-Q8ukplDTCamDqyMX5uA0cfVF7r4fTyKp7VDs_Kn0VFmCo9nu7MdvQ36s95jMnoZPvXuR4llUtSJTLucg1TGWII45ywzQhJFgRigNkO5CB2kC8gEWDW3lnU5oXYh5zYVVnZZO7rZ7l276rNBX-tV7i0WhSmxarxWUgCnXNEgr_-VjHelVIoHePkHLqvGleEVOiWUypQrFdDtFllXee8w02uXr4z71gT0JgW9SUFvUtCKBX6122l8-OnMmdLmfn-HghKgCAR3sXU5Iu7HXEDKGWG_SpiO0A</recordid><startdate>20081201</startdate><enddate>20081201</enddate><creator>Songting Chen</creator><creator>Hua-Gang Li</creator><creator>Tatemura, J.</creator><creator>Wang-Pin Hsiung</creator><creator>Agrawal, D.</creator><creator>Candan, K.S.</creator><general>IEEE</general><general>IEEE Computer Society</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20081201</creationdate><title>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</title><author>Songting Chen ; Hua-Gang Li ; Tatemura, J. ; Wang-Pin Hsiung ; Agrawal, D. ; Candan, K.S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Computer science; control theory; systems</topic><topic>Data processing. List processing. Character string processing</topic><topic>Database languages</topic><topic>Encoding</topic><topic>Exact sciences and technology</topic><topic>Extensible Markup Language</topic><topic>Filtering</topic><topic>Filtering algorithms</topic><topic>Filtration</topic><topic>generalized-tree-pattern queries</topic><topic>Information systems. Data bases</topic><topic>Matched filters</topic><topic>Memory organisation. Data processing</topic><topic>Merging</topic><topic>Polynomials</topic><topic>Queries</topic><topic>result encoding</topic><topic>Scalability</topic><topic>Software</topic><topic>Streams</topic><topic>Studies</topic><topic>Telecommunication traffic</topic><topic>Web services</topic><topic>XML</topic><topic>XML filtering</topic><topic>XML streams</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Songting Chen</creatorcontrib><creatorcontrib>Hua-Gang Li</creatorcontrib><creatorcontrib>Tatemura, J.</creatorcontrib><creatorcontrib>Wang-Pin Hsiung</creatorcontrib><creatorcontrib>Agrawal, D.</creatorcontrib><creatorcontrib>Candan, K.S.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore (Online service)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Songting Chen</au><au>Hua-Gang Li</au><au>Tatemura, J.</au><au>Wang-Pin Hsiung</au><au>Agrawal, D.</au><au>Candan, K.S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2008-12-01</date><risdate>2008</risdate><volume>20</volume><issue>12</issue><spage>1627</spage><epage>1640</epage><pages>1627-1640</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex generalized-tree-pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via a shared bottom-up path matching. Second, with the aid of this TOP encoding, we can (1) achieve polynomial time and space complexity for post processing, (2) avoid redundant predicate evaluations, (3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches and (4) simplify the processing of GTP queries. Overall our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient post processing for GTP queries. Extensive performance studies show that our GFilter solution not only achieves significantly better filtering performance than state-of-the-art algorithms, but also is capable of efficiently filtering the more complex GTP queries.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TKDE.2008.83</doi><tpages>14</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1041-4347 |
ispartof | IEEE transactions on knowledge and data engineering, 2008-12, Vol.20 (12), p.1627-1640 |
issn | 1041-4347 1558-2191 |
language | eng |
recordid | cdi_proquest_miscellaneous_34677884 |
source | IEEE Xplore (Online service) |
subjects | Algorithms Applied sciences Computer science control theory systems Data processing. List processing. Character string processing Database languages Encoding Exact sciences and technology Extensible Markup Language Filtering Filtering algorithms Filtration generalized-tree-pattern queries Information systems. Data bases Matched filters Memory organisation. Data processing Merging Polynomials Queries result encoding Scalability Software Streams Studies Telecommunication traffic Web services XML XML filtering XML streams |
title | Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T12%3A09%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Scalable%20Filtering%20of%20Multiple%20Generalized-Tree-Pattern%20Queries%20over%20XML%20Streams&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Songting%20Chen&rft.date=2008-12-01&rft.volume=20&rft.issue=12&rft.spage=1627&rft.epage=1640&rft.pages=1627-1640&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2008.83&rft_dat=%3Cproquest_ieee_%3E34677884%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c375t-79644078aac1eeb43fa5718201a02cfe7d3fae2d0f50c8bcc36412cd7bc95c763%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=912279488&rft_id=info:pmid/&rft_ieee_id=4509431&rfr_iscdi=true |