An empirical evaluation of exact set similarity join techniques using GPUs

Bibliographic Details
Published in: Information systems (Oxford) 2020-03, Vol.89, p.101485, Article 101485
Main Authors: Bellas, Christos; Gounaris, Anastasios
Format: Article
Language: English
Description
Summary: Exact set similarity join is a notoriously expensive operation, for which several solutions have been proposed. Recently, there have been studies that present a comparative analysis using MapReduce or a non-parallel setting. Our contribution is that we complement these works by conducting a thorough evaluation of the state-of-the-art GPU-enabled techniques. These techniques are highly diverse in their key features, and our experiments reveal the key strengths of each one. As we explain, in real-life applications there is no dominant solution. Depending on specific dataset and query characteristics, each solution, even one that does not use the GPU at all, has its own sweet spot. All our work is repeatable and extensible.
• A thorough evaluation showing the sweet spot of each technique for exact set similarity joins using a GPU.
• For large threshold values, the sequential CPU techniques are competitive.
• For lower threshold values, employing parallel GPU techniques seems beneficial.
• Overall, GPU techniques may perform worse due to the imposed quadratic space overhead.
• A CPU-GPU co-processing scheme performs better in some cases due to efficient workload balancing.
ISSN: 0306-4379, 1873-6076
DOI: 10.1016/j.is.2019.101485