Global and local structure preserving GPU t-SNE methods for large-scale applications

Bibliographic Details
Published in: Expert Systems with Applications, 2022-09, Vol. 201, p. 116918, Article 116918
Main Authors: Meyer, Bruno Henrique; Pozo, Aurora Trinidad Ramirez; Nunan Zola, Wagner M.
Format: Article
Language: English
Description
Summary: Dimensionality reduction techniques such as t-distributed stochastic neighbor embedding (t-SNE) have become essential for visualizing large-scale datasets. State-of-the-art t-SNE-based techniques rely on a variety of methods to take advantage of GPU parallelism. The main contribution of this work is a new approach named simulated wide-warp anchor t-SNE (SWW-AtSNE), which combines the simulated wide-warp t-SNE (SWW-tSNE) technique with the anchor t-SNE (AtSNE) approach. SWW-AtSNE preserves global structures better than SWW-tSNE and runs faster than AtSNE. The preservation of global structures was measured with a new metric called medium neighborhood preservation (MNP). We also propose and study adaptations of SWW-tSNE, which consist of using a preprocessing technique or changing the initialization method to use principal component analysis (PCA). Both SWW-AtSNE and the adapted SWW-tSNE can perform dimensionality reduction to two dimensions in addition to three. Furthermore, this research compares different t-SNE-based techniques on large-scale datasets using two essential criteria: the preservation of global structures and the preservation of local structures. It also compares seven methods through two AI applications: reinforcement learning and generative adversarial networks (GANs). The experimental results show that strategies such as the AtSNE method could improve dimensionality reduction quality with respect to the preservation of global structures. However, AtSNE cannot achieve better results than other approaches, such as using PCA in the initialization of t-SNE. Nevertheless, the ideas of both methods could be merged into a single technique in future studies.
• Fast SWW-AtSNE method for dimensionality reduction preserves global and local structures.
• Introduction of a new metric to quantify global structure preservation.
• Analysis of GPU t-SNE-based methods in real-world applications with large datasets.
• PCA initialization in SWW-tSNE is fundamental to preserving global structures.
• UMAP and AtSNE do not preserve global structures better than SWW-tSNE.
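
The summary names the preservation of local and global structures as comparison criteria but does not define how they are computed. As a rough illustration only, the sketch below computes a generic k-nearest-neighbor preservation score: the mean fraction of each point's k nearest neighbors in the original space that remain among its k nearest neighbors in the embedding. This is not the MNP metric proposed in the article, whose definition is not given in this record. The function names `knn_indices` and `neighborhood_preservation` are hypothetical, and the brute-force search is only suitable for small examples, not the large-scale GPU setting the article targets.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with that point's index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighborhood_preservation(high_dim, low_dim, k):
    """Mean fraction of each point's k nearest neighbors that the embedding keeps.

    Illustrative score only (an assumption of this sketch), not the article's MNP.
    """
    if len(high_dim) != len(low_dim):
        raise ValueError("both point sets must have the same length")
    if not 1 <= k < len(high_dim):
        raise ValueError("k must satisfy 1 <= k < number of points")
    high_nn = knn_indices(high_dim, k)
    low_nn = knn_indices(low_dim, k)
    # Per-point overlap between the two neighbor sets, normalized by k.
    overlap = [len(h & l) / k for h, l in zip(high_nn, low_nn)]
    return sum(overlap) / len(overlap)
```

With a small k the score reflects local structure. Larger values of k probe progressively wider neighborhoods, which is one informal way to think about the local-versus-global distinction drawn in the summary.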
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2022.116918