Cluster validation in problems with increasing dimensionality and unbalanced clusters
Published in: | Neurocomputing (Amsterdam) 2014-01, Vol.123, p.33-39 |
---|---|
Format: | Article |
Language: | English |
Summary: | Cluster validation methods provide measures to evaluate the quality of a clustering partition on a given data set and to determine the correct number of clusters. Recently, a new set of validation techniques based on the clusters' negentropy has been introduced. Negentropy-based cluster validation favors partitions of the data into compact clusters that do not overlap strongly. It is simple to evaluate and has been shown to perform better than other state-of-the-art techniques. However, like many other cluster validation approaches, it has problems validating partitions in which some regions contain only a few data points. Several heuristics have been proposed to cope with this problem, and this paper analyzes them systematically. We study the performance of AIC, BIC, and four negentropy-based validation approaches on synthetic clustering problems of increasing dimensionality, with unbalanced clusters and different degrees of overlap. Our results suggest that negentropy-based validation techniques outperform AIC and BIC when the ratio of the number of points to the dimension is not high, which is a very common situation in real applications. |
ISSN: | 0925-2312, 1872-8286 |
DOI: | 10.1016/j.neucom.2012.09.044 |
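
The summary contrasts AIC and BIC with negentropy-based validation when the number of points is small relative to the dimension. The sketch below illustrates one reason that ratio matters, using the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. It assumes a full-covariance Gaussian mixture as the underlying model; the record does not state which model the paper uses, so the parameter count and the function names here are illustrative assumptions, not the authors' setup.

```python
import math


def gmm_free_parameters(n_clusters: int, dim: int) -> int:
    """Free parameters of a full-covariance Gaussian mixture (assumed model)."""
    weights = n_clusters - 1                          # mixing weights sum to 1
    means = n_clusters * dim                          # one mean vector per cluster
    covariances = n_clusters * dim * (dim + 1) // 2   # symmetric matrix per cluster
    return weights + means + covariances


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_points: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_points) - 2.0 * log_likelihood


if __name__ == "__main__":
    n_points = 500  # hypothetical sample size, chosen only for illustration
    for dim in (2, 10, 30):
        for k in (2, 3, 4):
            p = gmm_free_parameters(k, dim)
            # Parameters per data point: a rough gauge of how poorly
            # constrained the fitted model is at this dimension.
            print(f"dim={dim:2d} k={k} params={p:5d} params/point={p / n_points:.2f}")
```

Under this assumption the parameter count grows quadratically with the dimension, so for a fixed sample size the penalty terms and the reliability of the likelihood estimate both change quickly as the dimension increases.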