An empirical evaluation of exact set similarity join techniques using GPUs
Published in: | Information systems (Oxford) 2020-03, Vol.89, p.101485, Article 101485 |
---|---|
Format: | Article |
Language: | English |
Summary: | Exact set similarity join is a notoriously expensive operation for which several solutions have been proposed. Recent studies have compared these solutions in MapReduce or non-parallel settings. We complement these works with a thorough evaluation of the state-of-the-art GPU-enabled techniques. These techniques are highly diverse in their key features, and our experiments reveal the key strengths of each one. As we explain, in real-life applications there is no dominant solution. Depending on the specific dataset and query characteristics, each solution, including those that do not use the GPU at all, has its own sweet spot. All our work is repeatable and extensible.
• A thorough evaluation showing the sweet spot of each technique for exact set similarity joins using a GPU.
• For large threshold values, the sequential CPU techniques are competitive.
• For lower threshold values, employing parallel GPU techniques seems beneficial.
• Overall, GPU techniques may perform worse due to the imposed quadratic space overhead.
• A CPU-GPU co-processing scheme performs better in some cases due to efficient workload balancing. |
---|---|
ISSN: | 0306-4379 1873-6076 |
DOI: | 10.1016/j.is.2019.101485 |
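
For readers unfamiliar with the operation named in the summary, the following is a minimal sketch of an exact set similarity self-join. It assumes Jaccard similarity as the measure (the record does not say which measure the article uses) and compares every pair of sets on the CPU. It is a naive baseline for illustration only, not any of the techniques the article evaluates; the names `jaccard`, `naive_self_join`, and the sample `records` are hypothetical.

```python
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; defined here as 0.0 for two empty sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def naive_self_join(records, threshold):
    """Return all index pairs (i, j), i < j, whose Jaccard similarity is >= threshold.

    Assumption: a brute-force comparison of every pair, used only to show
    what the join computes.
    """
    sets = [frozenset(r) for r in records]
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(sets), 2)
        if jaccard(a, b) >= threshold
    ]


# Hypothetical sample data: each record is a set of tokens.
records = [
    {"gpu", "join", "set", "similarity"},
    {"gpu", "join", "set", "exact"},
    {"cpu", "mapreduce"},
]

print(naive_self_join(records, 0.5))  # lower threshold: more candidate pairs qualify
print(naive_self_join(records, 0.9))  # higher threshold: fewer pairs qualify
```

The threshold is the parameter the highlights refer to: raising it shrinks the result set, lowering it admits more pairs.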