FAR-HD: A fast and efficient algorithm for mining fuzzy association rules in large high-dimensional datasets
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | Fuzzy Association Rule Mining (ARM) has been used extensively on relational and transactional datasets with a small-to-medium number of attributes (dimensions). The mined fuzzy association rules (patterns) are used not only for manual analysis by domain experts but also to drive further mining tasks, such as classification and clustering, that automate decision-making. Such rules can also be derived from high-dimensional numerical datasets, such as image datasets, in order to train fuzzy associative classifiers or clustering algorithms. Traditional Fuzzy ARM algorithms cannot mine rules from such datasets efficiently, because they are designed for datasets with far fewer attributes. This paper therefore proposes FAR-HD, a Fuzzy ARM algorithm designed specifically for large high-dimensional datasets. FAR-HD processes fuzzy frequent itemsets in depth-first search (DFS) order using a two-phase, multiple-partition, tidlist-based strategy. It represents tidlists as byte vectors and stores them in main memory in compressed form, using a fast generic compression method. In addition, FAR-HD uses fuzzy clustering to convert each numerical vector of the original input dataset into a fuzzy-cluster-based representation, which is then used for the actual Fuzzy ARM process. In experiments, FAR-HD was 7-15 times faster than Fuzzy Apriori, the most popular Fuzzy ARM algorithm, and 1.1-4 times faster than a Fuzzy ARM algorithm that the authors proposed earlier for very large but traditional datasets (those with fewer attributes). |
---|---|
ISSN: | 1098-7584 |
DOI: | 10.1109/FUZZ-IEEE.2013.6622333 |
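
The byte-vector tidlist idea mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not name the compression method, the membership encoding, or the combination operator, so the sketch assumes `zlib` as the generic compressor, one byte per transaction (membership degrees in [0, 1] quantized to 0-255), and the min t-norm for combining items. The function names and sample data are hypothetical, and the DFS traversal and two-phase partitioning are not covered.

```python
import zlib

SCALE = 255  # assumption: memberships in [0, 1] are quantized to one byte


def to_tidlist(memberships):
    """Quantize per-transaction membership degrees into a compressed byte vector."""
    raw = bytes(round(mu * SCALE) for mu in memberships)
    return zlib.compress(raw)


def intersect(tidlist_a, tidlist_b):
    """Combine two compressed tidlists with the min t-norm (an assumption)."""
    a = zlib.decompress(tidlist_a)
    b = zlib.decompress(tidlist_b)
    return zlib.compress(bytes(min(x, y) for x, y in zip(a, b)))


def fuzzy_support(tidlist):
    """Mean membership over all transactions, rescaled back to [0, 1]."""
    raw = zlib.decompress(tidlist)
    return sum(raw) / (SCALE * len(raw))


# Hypothetical data: membership of four transactions in two fuzzy items.
item_a = to_tidlist([1.0, 0.6, 0.0, 0.2])
item_b = to_tidlist([0.8, 0.2, 0.4, 0.0])
itemset_ab = intersect(item_a, item_b)
support_ab = fuzzy_support(itemset_ab)
```

Keeping one byte per transaction makes the tidlist of a 2-itemset a simple element-wise operation on two equal-length vectors, and long runs of zero membership, which are likely in sparse data, tend to compress well under a generic compressor.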