GMiner++: Boosting GPU-based frequent itemset mining by reducing redundant computations
Published in: | Expert systems with applications 2024-09, Vol.250, p.123928, Article 123928 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | Frequent itemset mining (FIM) is an increasingly important fundamental data mining technique. However, the applicability of existing FIM methods is limited, mainly because of their performance. Although numerous efficient single-threaded FIM methods have been proposed, their performance gains are limited because they exploit only a single thread. To overcome this shortcoming, numerous parallel FIM methods have been devised using graphics processing units (GPUs) or multicore central processing units (CPUs). However, when extracting patterns from large amounts of data, multi-threaded FIM methods show performance trends similar to those of single-threaded FIM methods because of their large memory footprints and computational cost. Hence, we propose GMiner++, a memory-efficient GPU-based FIM method designed to run on several GPUs. We propose sub-databases of equal size, called bit array blocks, which contain pre-calculated bit arrays of F1∪P(IK). These bit arrays are repeatedly exploited during mining tasks using an elegant probabilistic model. GMiner++ can obtain frequent patterns and use several GPUs using only the bit array blocks and the occurrence update scheme. The pre-calculated bit arrays stored in the bit array blocks reduce redundant computations. In addition, GMiner++ does not create intermediate data during mining tasks, which increases robustness and reduces memory footprints. Simulation results demonstrate that GMiner++ outperforms existing FIM methods in performance and scalability while improving robustness. |
---|---|
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2024.123928 |
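
The Summary refers to pre-calculated bit arrays that are reused while mining. As a rough illustration of the general bitmap idea only, the single-threaded Python sketch below stores one bit array per item and computes the support of an itemset by ANDing those arrays and counting the set bits. It is not the GMiner++ algorithm: it does not model bit array blocks, the occurrence update scheme, the probabilistic model, or any GPU execution. All function names and the toy data are hypothetical and chosen for this example.

```python
from itertools import combinations


def build_bit_arrays(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    bit_arrays = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bit_arrays[item] = bit_arrays.get(item, 0) | (1 << tid)
    return bit_arrays


def support(itemset, bit_arrays, n_transactions):
    """Support = popcount of the bitwise AND of the items' bit arrays."""
    mask = (1 << n_transactions) - 1  # start with every transaction selected
    for item in itemset:
        mask &= bit_arrays.get(item, 0)
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration over frequent single items (illustrative only)."""
    n = len(transactions)
    bit_arrays = build_bit_arrays(transactions)  # computed once, reused below
    frequent_items = sorted(
        item for item in bit_arrays
        if support((item,), bit_arrays, n) >= min_support
    )
    result = {}
    for size in range(1, len(frequent_items) + 1):
        found_any = False
        for candidate in combinations(frequent_items, size):
            count = support(candidate, bit_arrays, n)
            if count >= min_support:
                result[candidate] = count
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so none of a larger size exists.
            break
    return result


if __name__ == "__main__":
    toy_transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    for itemset, count in frequent_itemsets(toy_transactions, 2).items():
        print(itemset, count)
```

The point of the sketch is that the per-item bit arrays are built once and every support count afterwards is a sequence of AND operations plus a population count, which is the kind of work that maps well onto wide parallel hardware.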