k-NN Sampling for Visualization of Dynamic Data Using LION-tSNE

Bibliographic Details
Main Authors: Dharamsotu, Bheekya; Rani, K. Swarupa; Abdul Moiz, Salman; Rao, C. Raghavendra
Format: Conference Proceeding
Language: English
Description
Summary: Dimensionality reduction algorithms, most of which are non-parametric, are often used to visualize multi-dimensional data. Non-parametric methods provide no explicit mechanism for adding new data points to an existing environment, which limits the applicability of visualization in Big Data scenarios. The LION-tSNE (Local Interpolation with Outlier coNtrol t-Distributed Stochastic Neighbor Embedding) method was proposed to overcome the limitations of existing techniques. LION-tSNE uses random sampling to select the data for the tSNE model, which creates an initial visual environment; new data points are then added to this environment using local IDW (Inverse Distance Weighting) interpolation. A randomly selected sample is often not representative of the whole data, which creates inconsistency in the tSNE environment. To overcome this problem, two new sampling methods are proposed, both based on k-NN (k-Nearest Neighbor) graph update properties. It is shown empirically that the proposed methods outperform the existing LION-tSNE method, with 0.5 to 2% higher k-NN accuracy and more consistent results. The study uses five datasets with different characteristics and three different initial solutions of tSNE. A pairwise t-test shows that the results of the proposed methods are statistically significant.
ISSN: 2640-0316
DOI: 10.1109/HiPC.2019.00019
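
The summary mentions that new points are added to an existing tSNE environment by local inverse distance weighting. The sketch below illustrates that general idea only and is not the authors' implementation: the function name `idw_interpolate`, the `radius` and `power` parameters, and the choice to return `None` when no sampled point lies within the radius are all assumptions made for illustration. Outlier placement and the proposed k-NN-based sampling are not shown.

```python
import numpy as np


def idw_interpolate(x_new, X, Y, radius, power=2.0, eps=1e-12):
    """Place x_new in the embedding by local inverse distance weighting.

    X: (n, d) sampled high-dimensional points (hypothetical input).
    Y: (n, 2) their positions in the existing embedding.
    radius: only sampled points within this distance of x_new contribute
            (an assumed way of making the interpolation "local").
    Returns a length-2 array, or None if no sampled point is within radius.
    """
    # Distances from the new point to every sampled point.
    d = np.linalg.norm(X - x_new, axis=1)

    # If x_new coincides with a sampled point, reuse its embedding directly
    # to avoid dividing by zero.
    nearest = np.argmin(d)
    if d[nearest] < eps:
        return Y[nearest].copy()

    # Keep only the local neighbourhood.
    mask = d <= radius
    if not mask.any():
        return None  # no neighbours; outlier handling is out of scope here

    # Weights fall off as 1 / distance**power; closer points dominate.
    w = 1.0 / d[mask] ** power
    return (w[:, None] * Y[mask]).sum(axis=0) / w.sum()
```

A new point that is equidistant from two sampled points lands midway between their embedded positions; a larger `power` pulls the result more strongly toward the nearest neighbour.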