GMiner++: Boosting GPU-based frequent itemset mining by reducing redundant computations

Bibliographic Details
Published in: Expert systems with applications 2024-09, Vol.250, p.123928, Article 123928
Main Authors: Chon, Kang-Wook; Kim, Chanki
Format: Article
Language: English
Description
Summary: Frequent itemset mining (FIM) is an increasingly important fundamental data mining technique. However, the applicability of existing FIM methods is limited, mainly because of their performance. Although numerous efficient single-threaded FIM methods have been proposed, their performance gains are limited because they exploit only a single thread. To overcome this shortcoming, numerous parallel FIM methods have been devised using graphics processing units (GPUs) or multicore central processing units (CPUs). However, when extracting patterns from large amounts of data, multi-threaded FIM methods exhibit a performance tendency similar to that of single-threaded FIM methods because of their large memory footprints and computations. Hence, we propose GMiner++, a memory-efficient GPU-based FIM method that can use several GPUs. We propose a sub-database of the same size called bit array blocks, which contains pre-calculated bit arrays of F1∪P(IK). These bit arrays are repeatedly exploited during mining tasks using an elegant probabilistic model. GMiner++ can obtain frequent patterns and use several GPUs using only the bit array blocks and the occurrence update scheme. The proposed method reduces redundant computations by using pre-calculated bit arrays with bit array blocks. In addition, GMiner++ does not create intermediate data during mining tasks, which increases robustness and reduces memory footprints. Simulation results demonstrate that GMiner++ outperforms existing FIM methods in terms of performance and scalability, with increased robustness.
ISSN: 0957-4174
DOI: 10.1016/j.eswa.2024.123928
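
Illustration: The summary refers to pre-calculated bit arrays that are reused during mining. As a rough sketch of the general bit-array idea only, and not of GMiner++ itself, the Python below stores one bit array per item (bit t is set when transaction t contains the item) and counts the support of an itemset by ANDing those arrays and counting the set bits. The function names, the use of Python integers as bit arrays, and the toy transactions are assumptions made for this example; the bit array blocks, the occurrence update scheme, and the multi-GPU execution described in the summary are not modelled.

```python
from functools import reduce


def build_bit_arrays(transactions):
    """Return {item: int}, where bit t of the int is set if transaction t has the item."""
    bit_arrays = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bit_arrays[item] = bit_arrays.get(item, 0) | (1 << tid)
    return bit_arrays


def support(itemset, bit_arrays):
    """Count transactions containing every item of a non-empty itemset.

    The per-item bit arrays are ANDed together; the number of set bits
    in the result is the support. Unknown items contribute an all-zero array.
    """
    if not itemset:
        raise ValueError("itemset must contain at least one item")
    combined = reduce(lambda a, b: a & b, (bit_arrays.get(i, 0) for i in itemset))
    return bin(combined).count("1")


# Hypothetical toy database of four transactions.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
bit_arrays = build_bit_arrays(transactions)  # computed once, reused for every query

support_ac = support(("a", "c"), bit_arrays)
support_ab = support(("a", "b"), bit_arrays)
```

Because the per-item arrays are built once, each candidate itemset costs only a few bitwise operations, which is the kind of reuse that makes bit-array representations attractive for parallel hardware.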