Selection of K in K-means clustering

Bibliographic Details
Published in: Proceedings of the Institution of Mechanical Engineers. Part C, Journal of mechanical engineering science, 2005-01, Vol.219 (1), p.103-119
Main Authors: Pham, D T; Dimov, S S; Nguyen, C D
Format: Article
Language: English
Description
Summary: The K-means algorithm is a popular data-clustering algorithm. However, one of its drawbacks is the requirement for the number of clusters, K, to be specified before the algorithm is applied. This paper first reviews existing methods for selecting the number of clusters for the algorithm. Factors that affect this selection are then discussed and a new measure to assist the selection is proposed. The paper concludes with an analysis of the results of using the proposed measure to determine the number of clusters for the K-means algorithm for different data sets.
ISSN: 0954-4062, 2041-2983
DOI: 10.1243/095440605X8298