Efficiency Analysis of the access method with the cascading Bloom filter to the data warehouse on the parallel computing platform

Bibliographic Details
Published in: Journal of Physics: Conference Series, 2017-10, Vol. 913 (1), p. 012011
Main Authors: Grigoriev, Yu. A.; Proletarskaya, V. A.; Ermakov, E. Yu.; Ermakov, O. Yu.
Format: Article
Language: English
Description
Summary: A new method for executing SQL queries in the Apache Spark parallel computing environment was developed, based on a cascading Bloom filter (CBF). The method comprises four steps: representing the original query as several subqueries; building a connection graph and transforming the subqueries; identifying the connections where Bloom filters must be used; and representing the graph in terms of Spark. Full-scale experiments on query Q3 of the TPC-H benchmark confirmed the effectiveness of the developed method.
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/913/1/012011
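
The summary does not spell out how the Bloom filters are chained, so the following is only a minimal sketch of one plausible reading, written in plain Python rather than Spark. It assumes that "cascading" means the keys surviving one join are used to build the filter that prunes the next table, using the three tables of TPC-H Q3 (customer, orders, lineitem) with toy rows. The class `BloomFilter`, the function `cascade_join`, the filter sizes, and the sample data are all hypothetical and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; bit positions come from double hashing a SHA-256 digest."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def cascade_join(customers, orders, lineitems, segment):
    """Assumed cascade: customer keys prune orders, surviving order keys prune lineitems."""
    # Stage 1: filter built from the customers that pass the segment predicate.
    cust_keys = {custkey for custkey, seg in customers if seg == segment}
    cust_filter = BloomFilter()
    for key in cust_keys:
        cust_filter.add(key)

    # Cheap Bloom pre-filter first, then the exact join check removes false positives.
    candidate_orders = [o for o in orders if cust_filter.might_contain(o[1])]
    matched_orders = [o for o in candidate_orders if o[1] in cust_keys]

    # Stage 2: filter built only from the order keys that survived stage 1.
    order_keys = {orderkey for orderkey, _ in matched_orders}
    order_filter = BloomFilter()
    for key in order_keys:
        order_filter.add(key)

    candidate_items = [li for li in lineitems if order_filter.might_contain(li[0])]
    return [li for li in candidate_items if li[0] in order_keys]


# Toy rows (hypothetical): (c_custkey, c_mktsegment), (o_orderkey, o_custkey),
# (l_orderkey, l_extendedprice).
customers = [(1, "BUILDING"), (2, "AUTOMOBILE"), (3, "BUILDING")]
orders = [(10, 1), (11, 2), (12, 3), (13, 2)]
lineitems = [(10, 100.0), (11, 50.0), (12, 75.0), (13, 20.0), (10, 5.0)]

result = cascade_join(customers, orders, lineitems, "BUILDING")
```

In a distributed setting the point of the pre-filter is that a small bit array can be shipped to every worker so that non-matching rows are dropped before the shuffle; the exact check afterwards is still needed because a Bloom filter can report false positives but never false negatives.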