Loading…

Tree preserving embedding

The goal of dimensionality reduction is to embed high-dimensional data in a low-dimensional space while preserving structure in the data relevant to exploratory data analysis such as clusters. However, existing dimensionality reduction methods often either fail to separate clusters due to the crowdi...

Full description

Saved in:
Bibliographic Details
Published in:Proceedings of the National Academy of Sciences - PNAS 2011-10, Vol.108 (41), p.16916-16921
Main Authors: Shieh, Albert D, Hashimoto, Tatsunori B, Airoldi, Edoardo M
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The goal of dimensionality reduction is to embed high-dimensional data in a low-dimensional space while preserving structure in the data relevant to exploratory data analysis such as clusters. However, existing dimensionality reduction methods often either fail to separate clusters due to the crowding problem or can only separate clusters at a single resolution. We develop a new approach to dimensionality reduction: tree preserving embedding. Our approach uses the topological notion of connectedness to separate clusters at all resolutions. We provide a formal guarantee of cluster separation for our approach that holds for finite samples. Our approach requires no parameters and can handle general types of data, making it easy to use in practice and suggesting new strategies for robust data visualization.
ISSN:0027-8424
1091-6490
DOI:10.1073/pnas.1018393108