MR-MVPP: A map-reduce-based approach for creating MVPP in data warehouses for big data applications
Published in: | Information sciences 2021-09, Vol.570, p.200-224 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Materialized view selection (MVS) is the problem of choosing an appropriate set of views to materialize in order to speed up analytical query processing in data warehouses. Online analytical processing (OLAP) is an essential application of the MVS problem: storing the selected views reduces query response times. Views are intermediate results of query processing; the views selected in the MVS problem are stored and then reused to answer multiple queries. In the MVS problem, views are usually organized in a view representation structure, and the Multiple Views Processing Plan (MVPP) is a standard structure for this purpose. Because of the tremendous amount of data, constructing the MVPP is a challenge in big data applications. This paper proposes MR-MVPP (Map-Reduce-based construction of the MVPP) to address this problem. MR-MVPP performs a set similarity join (similarity-based join) on the base relations and views using the map-reduce model and a hashing technique. It reduces MVPP construction time by avoiding redundant calculations while the MVPP is being created. The performance of the proposed method is evaluated empirically. According to the experimental results, MR-MVPP has a shorter execution time than the other methods compared, with an average time improvement of about 26.5 units. This improvement exceeds that reported by similar studies in this area and is significant given the high volume of data in real applications. Moreover, the proposed method works well in terms of the effectiveness of the created MVPP and has about a 50% coverage rate for view selection methods. Deterministic methods are more accurate than hashing methods; as future work, they could be used for the set similarity join, which may improve the effectiveness of the constructed MVPP. |
---|---|
ISSN: | 0020-0255, 1872-6291 |
DOI: | 10.1016/j.ins.2021.04.004 |
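
The summary describes a set similarity join carried out with the map-reduce model and hashing. The record does not say which hashing scheme the paper uses, so the following is only a minimal sketch under stated assumptions: MinHash signatures with banding stand in for the hashing technique, each view or query is represented as a hypothetical set of string tokens, and the map and reduce phases are simulated in a single Python process. All names (`minhash`, `map_phase`, `reduce_phase`, `similarity_join`) and parameter values are illustrative, not taken from the paper.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 16              # signature length (assumed value)
BANDS = 8                    # number of bands for bucketing (assumed value)
ROWS = NUM_HASHES // BANDS   # signature rows per band


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.md5(f"{seed}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(tokens):
    """MinHash signature of a non-empty token set."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(NUM_HASHES))


def map_phase(record_id, tokens):
    """Map: emit one (band key, record id) pair per signature band."""
    sig = minhash(tokens)
    for b in range(BANDS):
        yield (b, sig[b * ROWS:(b + 1) * ROWS]), record_id


def reduce_phase(key, record_ids):
    """Reduce: records that share a band key become candidate pairs."""
    for a, b in combinations(sorted(set(record_ids)), 2):
        yield a, b


def jaccard(a, b):
    """Exact Jaccard similarity, used to verify candidate pairs."""
    return len(a & b) / len(a | b)


def similarity_join(records, threshold=0.5):
    """Simulate map, shuffle, and reduce, then verify the candidates."""
    buckets = defaultdict(list)              # shuffle: group values by key
    for rid, tokens in records.items():
        for key, value in map_phase(rid, tokens):
            buckets[key].append(value)
    candidates = set()
    for key, ids in buckets.items():
        candidates.update(reduce_phase(key, ids))
    return sorted(
        (a, b) for a, b in candidates
        if jaccard(records[a], records[b]) >= threshold
    )
```

Calling `similarity_join` on a dictionary that maps view identifiers to token sets returns the pairs whose exact Jaccard similarity meets the threshold. Because the candidates come from hashed buckets, some truly similar pairs can be missed, which matches the summary's remark that deterministic methods are more accurate than hashing methods.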