Fitting document representation to specific datasets by adjusting membership functions

Bibliographic Details
Main Authors: Garcia-Plaza, A. P., Fresno, V., Martinez, R.
Format: Conference Proceeding
Language: English
Description
Summary: In this work we address the problem of web page clustering from the point of view of document representation. Fuzzy rule-based systems have been used successfully to represent web documents by means of heuristic combinations of criteria. In these systems, the rules were established based on the way humans read documents, and they have been analyzed in previous works. However, the membership function parameters were fixed by default, assuming that any document would follow similar patterns regardless of the other documents in the collection. In this work we analyze to what extent collection information can be used to adjust the membership functions in order to improve document representation and, therefore, clustering results. We compare our proposal to the original approach on which it is based, and to other similar or common approaches. We also perform statistical significance tests to ensure that our modifications have a real effect on the original representation. Results show that adjusting document representation parameters to specific collections leads to better clustering results.
ISSN: 1098-7584
DOI: 10.1109/FUZZ-IEEE.2012.6251249
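
The summary contrasts membership functions with fixed default parameters against ones adjusted to a collection. This record does not say how the authors perform that adjustment, so the following is only a minimal sketch of the general idea under stated assumptions: three fuzzy sets ("low", "medium", "high") over a single numeric criterion, with breakpoints taken from collection percentiles. The function names, the percentile choice, and DEFAULT_BREAKPOINTS are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: NOT the method from the paper.
# Assumption: breakpoints of three fuzzy sets are fitted to percentiles
# of a criterion's values observed across the document collection.

DEFAULT_BREAKPOINTS = (25.0, 50.0, 75.0)  # hypothetical fixed defaults


def shoulder_left(x, a, b):
    """Membership 1 up to a, falling linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def shoulder_right(x, a, b):
    """Membership 0 up to a, rising linearly to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def triangle(x, a, b, c):
    """Membership 0 at a and c, peaking at 1 when x equals b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def percentile(values, p):
    """Linearly interpolated percentile, p in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def fit_breakpoints(values, low_p=25, mid_p=50, high_p=75):
    """Derive (low, mid, high) breakpoints from the collection's values."""
    return tuple(percentile(values, p) for p in (low_p, mid_p, high_p))


def memberships(x, breakpoints=DEFAULT_BREAKPOINTS):
    """Degrees of membership of x in the low, medium and high sets."""
    low, mid, high = breakpoints
    return {
        "low": shoulder_left(x, low, mid),
        "medium": triangle(x, low, mid, high),
        "high": shoulder_right(x, mid, high),
    }
```

For example, calling fit_breakpoints on the criterion values observed in one collection and passing the result to memberships makes the same raw value count as "high" in a collection where such values are rare and as "medium" in one where they are common, which is the kind of collection dependence the summary describes.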