
EM in high-dimensional spaces

Bibliographic Details
Published in: IEEE Transactions on Cybernetics, 2005-06, Vol. 35 (3), p. 571-577
Main Authors: Draper, B.A., Elliott, D.L., Hayes, J., Baek, K.
Format: Article
Language:English
Summary:This paper considers fitting a mixture of Gaussians model to high-dimensional data in scenarios where there are fewer data samples than feature dimensions. Issues that arise when using principal component analysis (PCA) to represent Gaussian distributions inside Expectation-Maximization (EM) are addressed, and a practical algorithm results. Unlike other algorithms that have been proposed, this algorithm does not try to compress the data to fit low-dimensional models. Instead, it models Gaussian distributions in the (N-1)-dimensional space spanned by the N data samples. We are able to show that this algorithm converges on data sets where low-dimensional techniques do not.
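The summary states the core idea only at a high level: rather than compressing the data to fit a low-dimensional model, the Gaussians are modeled in the (N-1)-dimensional subspace spanned by the N samples. The sketch below illustrates that general idea, not the authors' algorithm: it projects mean-centred data onto an orthonormal basis for the sample span (obtained via SVD) and runs standard EM in that subspace using scikit-learn's GaussianMixture. The helper name fit_gmm_in_sample_span and the synthetic data are illustrative assumptions, not from the paper.

# Illustrative sketch only (not the algorithm from the paper): fit a Gaussian
# mixture by EM inside the (N-1)-dimensional subspace spanned by the N samples,
# for data where the number of samples N is smaller than the dimension D.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm_in_sample_span(X, n_components=2, random_state=0):
    """X is an (N, D) data matrix with N samples and D >> N features."""
    N, D = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean                                    # centre the data
    # Orthonormal basis for the span of the centred samples (rank <= N-1).
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    rank = min(N - 1, int(np.sum(s > 1e-10)))
    basis = Vt[:rank]                                # (rank, D)
    Z = Xc @ basis.T                                 # coordinates in the span
    # Standard EM in the reduced space; reg_covar keeps the (still nearly
    # singular) covariance estimates well conditioned.
    gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                          reg_covar=1e-3, random_state=random_state).fit(Z)
    return gmm, basis, mean

# Synthetic usage: 40 samples in 1000 dimensions (fewer samples than features).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 1000)),
               rng.normal(3.0, 1.0, size=(20, 1000))])
gmm, basis, mean = fit_gmm_in_sample_span(X, n_components=2)
print(gmm.predict((X - mean) @ basis.T))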
ISSN:1083-4419
2168-2267
1941-0492
2168-2275
DOI:10.1109/TSMCB.2005.846670