Fitting document representation to specific datasets by adjusting membership functions
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | In this work we deal with the problem of web page clustering from the point of view of document representation. Fuzzy rule-based systems have been successfully used to represent web documents by means of heuristic combinations of criteria. In these systems, rules were established based on the way humans read documents and have been analyzed in previous works. However, the membership function parameters were fixed by default, assuming that any document would follow similar patterns regardless of the other documents in the collection. In this work we analyze to what extent collection information can be used to adjust the membership functions in order to improve document representation and, therefore, clustering results. We compare our proposal to the original one on which it is based, and to other similar or common approaches. We also perform statistical significance tests to ensure that our modifications have a real effect on the original representation. Results show that adjusting document representation parameters to specific collections leads to better clustering results. |
---|---|
ISSN: | 1098-7584 |
DOI: | 10.1109/FUZZ-IEEE.2012.6251249 |