
On Producing High and Early Result Throughput in Multijoin Query Plans

This paper introduces an efficient framework for producing high and early result throughput in multijoin query plans. While most previous research focuses on optimizing for cases involving a single join operator, this work takes a radical step by addressing query plans with multiple join operators....

Bibliographic Details
Published in: IEEE transactions on knowledge and data engineering 2011-12, Vol.23 (12), p.1888-1902
Main Authors: Levandoski, J. K., Khalefa, M. E., Mokbel, M. F.
Format: Article
Language: English
container_end_page 1902
container_issue 12
container_start_page 1888
container_title IEEE transactions on knowledge and data engineering
container_volume 23
creator Levandoski, J. K.
Khalefa, M. E.
Mokbel, M. F.
description This paper introduces an efficient framework for producing high and early result throughput in multijoin query plans. While most previous research focuses on optimizing for cases involving a single join operator, this work takes a radical step by addressing query plans with multiple join operators. The proposed framework consists of two main methods, a flush algorithm and operator state manager. The framework assumes a symmetric hash join, a common method for producing early results, when processing incoming data. In this way, our methods can be applied to a group of previous join operators (optimized for single-join queries) when taking part in multijoin query plans. Specifically, our framework can be applied by 1) employing a new flushing policy to write in-memory data to disk, once memory allotment is exhausted, in a way that helps increase the probability of producing early result throughput in multijoin queries, and 2) employing a state manager that adaptively switches operators in the plan between joining in-memory data and disk-resident data in order to positively affect the early result throughput. Extensive experimental results show that the proposed methods outperform the state-of-the-art join operators optimized for both single and multijoin query plans.
doi_str_mv 10.1109/TKDE.2010.182
format article
identifier ISSN: 1041-4347
ispartof IEEE transactions on knowledge and data engineering, 2011-12, Vol.23 (12), p.1888-1902
issn 1041-4347
1558-2191
language eng
recordid cdi_ieee_primary_5590243
source IEEE Electronic Library (IEL) Journals
subjects Algorithm design and analysis
Applied sciences
Computer science; control theory; systems
Database management
Database systems
Disks
Exact sciences and technology
Flushing
Information systems. Data bases
Memory organisation. Data processing
Operators
Optimization
Query processing
Radicals
Runtime
Software
Spatial databases
State of the art
Studies
Switches
systems
title On Producing High and Early Result Throughput in Multijoin Query Plans
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T16%3A23%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=On%20Producing%20High%20and%20Early%20Result%20Throughput%20in%20Multijoin%20Query%20Plans&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Levandoski,%20J.%20K.&rft.date=2011-12-01&rft.volume=23&rft.issue=12&rft.spage=1888&rft.epage=1902&rft.pages=1888-1902&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2010.182&rft_dat=%3Cproquest_ieee_%3E2552265381%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c386t-ff57229e6dbecb9704e7f2ce4a081075a59c395dbe938868a9d05731bea190d43%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=913426776&rft_id=info:pmid/&rft_ieee_id=5590243&rfr_iscdi=true
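
The description in this record states that the framework assumes a symmetric hash join, a common method for producing early results. As a rough illustration of that building block only, the following minimal Python sketch keeps one in-memory hash table per input and probes the opposite table as each tuple arrives. The function name `symmetric_hash_join`, the `("L", row)` / `("R", row)` stream encoding, and the key-extractor parameters are assumptions made for this example; they are not taken from the paper. The sketch assumes memory is unbounded, so it does not model the flushing policy or the operator state manager that the paper proposes.

```python
from collections import defaultdict
from typing import Any, Callable, Hashable, Iterable, Iterator, Tuple

Row = Any


def symmetric_hash_join(
    stream: Iterable[Tuple[str, Row]],
    left_key: Callable[[Row], Hashable],
    right_key: Callable[[Row], Hashable],
) -> Iterator[Tuple[Row, Row]]:
    """Join two interleaved inputs, emitting matches as soon as they exist.

    `stream` yields ("L", row) or ("R", row) in arrival order
    (a hypothetical encoding chosen for this sketch).
    """
    left_table = defaultdict(list)   # join key -> left rows seen so far
    right_table = defaultdict(list)  # join key -> right rows seen so far

    for side, row in stream:
        if side == "L":
            key = left_key(row)
            left_table[key].append(row)             # build step on own side
            for match in right_table.get(key, ()):  # probe the other side
                yield (row, match)
        else:
            key = right_key(row)
            right_table[key].append(row)
            for match in left_table.get(key, ()):
                yield (match, row)


# Tuples from both inputs arrive interleaved; results appear incrementally.
arrivals = [
    ("L", (1, "a")),
    ("R", (1, "x")),
    ("R", (2, "y")),
    ("L", (2, "b")),
    ("L", (1, "c")),
]
results = list(symmetric_hash_join(arrivals, lambda r: r[0], lambda r: r[0]))
for pair in results:
    print(pair)
```

Because each arriving tuple is inserted into its own table before probing the other, every matching pair is emitted exactly once, at the moment its second member arrives. That is the property that lets the operator produce results early instead of waiting for either input to finish.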