Interpretation of Structural Preservation in Low-Dimensional Embeddings

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2022-05, Vol. 34 (5), p. 2227-2240
Main Authors: Ghosh, Aindrila, Nashaat, Mona, Miller, James, Quader, Shaikh
Format: Article
Language: English
Description
Summary: Despite being commonly used in big-data analytics, the outcome of dimensionality reduction remains a black box to most of its users. Understanding the quality of a low-dimensional embedding is important because it not only enables trust in the transformed data but can also help to select the most appropriate dimensionality reduction algorithm in a given scenario. As existing research primarily focuses on the visual exploration of embeddings, there is still a need for enhancing the interpretability of such algorithms. To bridge this gap, we propose two novel interactive explanation techniques for low-dimensional embeddings obtained from any dimensionality reduction algorithm. The first technique, LAPS, produces a local approximation of the neighborhood structure to generate interpretable explanations of the preserved locality for a single instance. The second method, GAPS, explains the retained global structure of a high-dimensional dataset in its embedding by combining non-redundant local approximations from a coarse discretization of the projection space. We demonstrate the applicability of the proposed techniques using 16 real-life tabular, text, image, and audio datasets. Our extensive experimental evaluation shows the utility of the proposed techniques in interpreting the quality of low-dimensional embeddings, as well as in selecting the most suitable dimensionality reduction algorithm for any given dataset.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2020.3005878
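
Illustration (not from the article): the summary speaks of explaining the "preserved locality for a single instance". The sketch below is not LAPS or GAPS, whose details this record does not give. It only shows a generic, commonly used proxy for the same idea: the fraction of an instance's k nearest neighbors in the original space that remain among its k nearest neighbors in the embedding. The function names, the synthetic data, and the SVD-based 2-D projection standing in for a dimensionality reduction algorithm are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(points, query_index, k):
    """Indices of the k nearest neighbors (Euclidean) of one row of `points`."""
    dists = np.linalg.norm(points - points[query_index], axis=1)
    dists[query_index] = np.inf  # never count the instance as its own neighbor
    return np.argsort(dists)[:k]


def local_preservation(high_dim, embedding, query_index, k=10):
    """Share of the instance's k high-dimensional neighbors kept in the embedding."""
    hd_neighbors = set(knn_indices(high_dim, query_index, k).tolist())
    ld_neighbors = set(knn_indices(embedding, query_index, k).tolist())
    return len(hd_neighbors & ld_neighbors) / k


# Hypothetical data: 200 instances with 50 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))

# Stand-in embedding (assumption): project onto the top two principal
# directions obtained from an SVD of the centered data.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
Y = X_centered @ Vt[:2].T

score = local_preservation(X, Y, query_index=0, k=10)
print(f"neighborhood overlap for instance 0: {score:.2f}")
```

A score near 1 suggests that the instance's neighborhood survived the projection, while a score near 0 suggests it did not. Comparing such scores across embeddings produced by different algorithms is one simple way to approach the algorithm-selection question that the summary raises.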