Loading…
Keyword extraction by nonextensivity measure
The presence of a long-range correlation in the spatial distribution of a relevant word type, in spite of random occurrences of an irrelevant word type, is an important feature of human-written texts. We classify the correlation between the occurrences of words by nonextensive statistical mechanics...
Saved in:
Published in: | Physical review. E, Statistical, nonlinear, and soft matter physics Statistical, nonlinear, and soft matter physics, 2011-05, Vol.83 (5 Pt 2), p.056106-056106, Article 056106 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | The presence of a long-range correlation in the spatial distribution of a relevant word type, in spite of random occurrences of an irrelevant word type, is an important feature of human-written texts. We classify the correlation between the occurrences of words by nonextensive statistical mechanics for the word-ranking process. In particular, we look at the nonextensivity parameter as an alternative metric to measure the spatial correlation in the text, from which the words may be ranked in terms of this measure. Finally, we compare different methods for keyword extraction. |
---|---|
ISSN: | 1539-3755 1550-2376 |
DOI: | 10.1103/physreve.83.056106 |