Extracting Topological Features from Big Data Using Persistent Density Entropy

Bibliographic Details
Published in:	Journal of Physics: Conference Series, 2019-02, Vol. 1168 (3), p. 032017
Main Authors:	Xu, Jinzhong; Li, Xuzhi; Wang, Hongfei
Format:	Article
Language:	English
Subjects:	Big Data; Data analysis; Datasets; Density; Entropy; Feature extraction; Homology; Outliers (statistics); Topology; Uncertainty

Description
Summary:	Topological data analysis uses algebraic topology to extract shape information from big data. Persistent homology, one of its central methods, constructs a nested sequence of multi-scale simplicial complexes (a filtration) to approximate the underlying space of the data set. Studying these complexes summarizes the topological features of the data in each dimension. However, persistent homology does not quantify the uncertainty with which each simplicial complex approximates the underlying space. This paper defines an entropy, called persistent density entropy, that quantifies this uncertainty for each simplicial complex. The examples demonstrate that it can identify the simplicial complex that best approximates the underlying space and that it can, to a certain extent, be used to detect outliers.
ISSN:	1742-6588 1742-6596
DOI:	10.1088/1742-6596/1168/3/032017
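
Illustration:	The multi-scale construction mentioned in the summary can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' method: it uses a Vietoris-Rips complex (one common choice of filtration, which the record does not confirm the paper uses), and the function names `rips_complex` and `betti0`, the sample points, and the scale values are all hypothetical. It does not compute persistent density entropy, whose definition is not given in this record.

```python
from itertools import combinations
from math import dist


def rips_complex(points, eps, max_dim=2):
    """Simplices (tuples of point indices) of a Vietoris-Rips complex at scale eps.

    A simplex is included when all of its vertices are pairwise within eps.
    Illustrative brute-force construction; only suitable for tiny inputs.
    """
    n = len(points)
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= eps
    }
    simplices = [(i,) for i in range(n)]      # 0-simplices: the points
    for k in range(2, max_dim + 2):           # k vertices -> (k-1)-simplex
        for s in combinations(range(n), k):
            if all(pair in close for pair in combinations(s, 2)):
                simplices.append(s)
    return simplices


def betti0(n_points, simplices):
    """Number of connected components, via union-find over the edges."""
    parent = list(range(n_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    for s in simplices:
        if len(s) == 2:                       # only edges merge components
            parent[find(s[0])] = find(s[1])
    return len({find(i) for i in range(n_points)})


# Hypothetical data: a unit square plus one far-away point.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5)]
scales = [0.5, 1.0, 1.5, 6.0]

filtration = [rips_complex(points, eps) for eps in scales]
components = [betti0(len(points), cx) for cx in filtration]
print(components)
```

In this toy data set the far-away point stays a separate component until the largest scale, which is the kind of multi-scale behaviour a filtration exposes; how the paper's entropy scores each complex is not stated in this record.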