Efficient mining of top-k high utility itemsets through genetic algorithms
Published in: | Information sciences 2023-05, Vol.624, p.529-553 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Summary: | Mining high utility itemsets is an emerging and very active research area in data mining. The goal is to mine all itemsets whose utility value, in terms of importance to the user, is no less than a predefined threshold. Setting an appropriate threshold is not trivial: it requires not only multiple trials but also know-how in the application field. The advantage of algorithms for mining top-k high utility itemsets is that they do not require such a utility threshold, but they suffer from very long runtimes and large memory requirements on large input data. We propose a new genetic algorithm for mining top-k high utility itemsets, named TKHUIM-GA (Top-K High Utility Itemset Mining through Genetic Algorithms). It guides the search process by considering the utility of each item when producing initial solutions and when combining solutions, which reduces runtime and memory consumption. A highly efficient data representation is used to further reduce memory usage and runtime. A key advantage of TKHUIM-GA is that, unlike existing approaches, it works on positive, negative, integer and real unit utility values. Experiments on popular benchmark datasets demonstrate the high performance of the proposal compared with state-of-the-art algorithms. |
---|---|
ISSN: | 0020-0255 1872-6291 |
DOI: | 10.1016/j.ins.2022.12.092 |
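
To make the problem statement in the summary concrete, the following is a minimal sketch of what "top-k high utility itemsets" means, assuming the usual definition in which an itemset's utility is the sum of quantity times unit utility over every transaction that contains the whole itemset. It is a brute-force illustration only, not TKHUIM-GA or any method from the article. The toy database, the unit utilities, and the function names `itemset_utility` and `top_k_itemsets` are all hypothetical.

```python
from itertools import combinations
import heapq

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]

# Hypothetical unit utilities; negative and real values are allowed here.
UNIT_UTILITY = {"a": 5.0, "b": -1.5, "c": 2.0}


def itemset_utility(itemset, transactions, unit_utility):
    """Sum quantity * unit utility over transactions containing every item."""
    total = 0.0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * unit_utility[item] for item in itemset)
    return total


def top_k_itemsets(k, transactions, unit_utility):
    """Enumerate every non-empty itemset and keep the k with highest utility.

    Exhaustive enumeration is exponential in the number of items, so this
    is only usable on toy inputs (assumption: illustration, not a miner).
    """
    items = sorted(unit_utility)
    scored = (
        (itemset_utility(candidate, transactions, unit_utility), list(candidate))
        for size in range(1, len(items) + 1)
        for candidate in combinations(items, size)
    )
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    for utility, itemset in top_k_itemsets(2, TRANSACTIONS, UNIT_UTILITY):
        print(itemset, utility)
```

Note that no utility threshold appears anywhere: the caller supplies only `k`. The exponential cost of enumerating all itemsets is the reason heuristic searches such as genetic algorithms are of interest for this task.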