
A Novel Short Text Clustering Model Based on Grey System Theory


Bibliographic Details
Published in: Arabian journal for science and engineering (2011) 2020-04, Vol.45 (4), p.2865-2882
Main Authors: Fidan, Hüseyin; Yuksel, Mehmet Erkan
Format: Article
Language: English
Description
Summary: Short text clustering is challenging for structural reasons, especially when applied to small datasets. The limited number of words leads to poor-quality feature vectors, low clustering accuracy, and failed analyses. Although some approaches have been proposed in the related literature, there is still no agreement on an efficient solution. Grey system theory, which gives better results in numerical analyses with insufficient data, has not yet been applied to short text clustering. The purpose of our study is to develop a short text clustering model, based on Grey system theory, that is applicable to small datasets. To measure the efficiency of our method, book reviews labeled as negative or positive were obtained from Amazon.com dataset collections, and small datasets were created from them. Grey relational clustering, as well as hierarchical and partitional algorithms, was applied to the small datasets separately. According to the results, our model achieves better accuracy than the other algorithms when clustering small datasets containing short text. Consequently, we demonstrate that Grey relational clustering should be applied to short text clustering for better results.
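
The summary does not spell out how Grey relational clustering scores similarity. As a rough illustration only, the sketch below computes the standard grey relational grade (Deng's formulation) between one reference vector and several candidate vectors. It is not the authors' implementation: the function name `grey_relational_grades`, the toy term-count vectors, and the distinguishing coefficient `zeta = 0.5` are all assumptions made for this example.

```python
def grey_relational_grades(reference, candidates, zeta=0.5):
    """Return one grey relational grade per candidate (higher = more similar).

    reference  -- list of numbers (e.g. term counts of one short text)
    candidates -- list of equally long lists to compare against the reference
    zeta       -- distinguishing coefficient; 0.5 is a common default (assumed here)
    """
    # Absolute differences between the reference and each candidate, per feature.
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]

    # Global minimum and maximum difference over all candidates and features.
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)

    # If every candidate equals the reference, all grades are 1 by convention.
    if d_max == 0:
        return [1.0 for _ in candidates]

    # Grey relational coefficient per feature, averaged into a grade per candidate.
    return [
        sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in row) / len(row)
        for row in deltas
    ]


# Hypothetical term-count vectors for four vocabulary terms.
reference = [1, 0, 1, 0]
candidates = [
    [1, 0, 1, 0],  # identical to the reference
    [0, 1, 0, 1],  # differs in every term
    [1, 0, 0, 0],  # differs in one term
]
grades = grey_relational_grades(reference, candidates)
```

In a clustering setting, grades like these could be computed for every pair of texts and texts grouped when their grade exceeds a chosen threshold. The threshold and grouping rule would be further assumptions not stated in this record.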
ISSN: 2193-567X; 1319-8025; 2191-4281
DOI: 10.1007/s13369-019-04191-0