Semantic multi-grain mixture topic model for text analysis
Published in: | Expert systems with applications 2011-04, Vol.38 (4), p.3574-3579 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Summary: | ► The degree of generality of a topic can be quantified by topic granularity. ► DCT provides a mechanism for computing semantic topic granularity. ► A mixture semantic topic model is proposed to describe multi-grain topics.
Granular topic extraction and modeling are fundamental tasks in text analysis. Hierarchical topic clustering algorithms and hierarchical topic models are usually employed for these purposes. However, it is difficult to draw a clear distinction between any pair of hierarchical topics in terms of semantic granularity. STG (semantic topic granularity) is proposed to indicate the level of detail of a topic description, and aims to discriminate between topics on semantic grounds. A new model based on STG, mgMTM (multi-grain mixture topic model), is then proposed to model grain topics. DCT (discrete cosine transform) is employed to provide a mechanism for computing STG, extracting grain topics, and learning mgMTM. Experiments on real-world datasets show that the proposed model achieves a lower perplexity score than the LDA model and thus generalizes better in describing text. Experiments also show that the descriptions of the extracted grain topics can be well explained on a dataset that includes topics about the recent global financial crisis. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2010.08.146 |