Clustering subgaussian mixtures by semidefinite programming

Bibliographic Details
Published in: Information and Inference, 2017-12, Vol. 6 (4), p. 389-415
Main Authors: Mixon, Dustin G.; Villar, Soledad; Ward, Rachel
Format: Article
Language: English
Description
Summary: We introduce a model-free relax-and-round algorithm for k-means clustering based on a semidefinite relaxation due to Peng and Wei (2007, SIAM J. Optim., 18, 186–205). The algorithm interprets the output of the semidefinite program as a denoised version of the original data and then rounds this output to a hard clustering. We provide a generic method for proving performance guarantees for this algorithm, and we analyse the algorithm in the context of subgaussian mixture models. We also study the fundamental limits of estimating Gaussian centers by k-means clustering to compare our approximation guarantee to the theoretically optimal k-means clustering solution.
ISSN: 2049-8764
EISSN: 2049-8772
DOI: 10.1093/imaiai/iax001