Promoting Total Efficiency in Text Clustering via Iterative and Interactive Metric Learning
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | In this paper, we propose a framework to make the text clustering process, as a whole, efficient. In a real text clustering task, an analyst usually has some expectation of the results in mind. However, a single run of a clustering algorithm on the preprocessed data would not satisfy that expectation. The analyst then faces labor-intensive trials to improve the results, involving repetitive feature refinement and parameter tuning. We develop the Iterative and Interactive Metric Learning System (IIMLS) to address this challenge. Specifically, IIMLS allows analysts to input feedback on the current clustering result. Given the feedback, IIMLS optimizes the metric in the feature space so that the clustering algorithm, applied with the refined metric, reflects the feedback. As a byproduct, the learned metric may be reused for a similar dataset. Illustrative examples on a real-world dataset show that IIMLS can dramatically improve the efficiency of a text clustering task. The learned "knowledge", that is, the metric, is visualized to give insight into the optimized feature metric. |
---|---|
ISSN: | 1550-4786 2374-8486 |
DOI: | 10.1109/ICDM.2009.124 |
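
The feedback loop described in the summary (cluster, collect analyst feedback, refine the metric, cluster again) can be illustrated with a minimal sketch. This is not the IIMLS algorithm from the paper: the record does not specify the feedback format or the optimizer, so the sketch assumes must-link / cannot-link document pairs as feedback, a diagonal (per-feature weight) metric, a simple heuristic weight update, and plain k-means. All function names and the toy data are hypothetical.

```python
# Illustrative sketch only; every name and the update rule are assumptions,
# not the method of the IIMLS paper.

def weighted_sq_dist(x, y, w):
    """Squared Euclidean distance under per-feature weights w (a diagonal metric)."""
    return sum(wf * (a - b) ** 2 for wf, a, b in zip(w, x, y))

def kmeans(points, k, w, iters=20):
    """Plain k-means using the weighted distance; farthest-first init keeps it deterministic."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(weighted_sq_dist(p, c, w) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: weighted_sq_dist(p, centroids[c], w))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def update_metric(points, w, must_link, cannot_link, lr=0.1, eps=1e-6):
    """Assumed heuristic: shrink weights of features that separate must-link pairs,
    grow weights of features that separate cannot-link pairs, then renormalise."""
    grad = [0.0] * len(w)
    for i, j in must_link:
        for f in range(len(w)):
            grad[f] += (points[i][f] - points[j][f]) ** 2
    for i, j in cannot_link:
        for f in range(len(w)):
            grad[f] -= (points[i][f] - points[j][f]) ** 2
    new_w = [max(eps, wf - lr * g) for wf, g in zip(w, grad)]
    scale = len(new_w) / sum(new_w)
    return [wf * scale for wf in new_w]

# Toy "documents" with two features; feature 0 dominates the default metric.
docs = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
w = [1.0, 1.0]
before = kmeans(docs, 2, w)

# Hypothetical analyst feedback: documents 0 and 2 belong together, as do 1 and 3.
must_link = [(0, 2), (1, 3)]
cannot_link = [(0, 1), (2, 3)]
w = update_metric(docs, w, must_link, cannot_link)
after = kmeans(docs, 2, w)

print("before:", before)
print("after: ", after)
print("learned weights:", w)
```

On this toy data the first run groups the documents by feature 0, and the run with the refined weights groups them by feature 1, matching the feedback. The learned weight vector is also the kind of object that could be reused on a similar dataset or inspected to see which features the analyst's feedback emphasised.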