Loading…

Fast density peak clustering for large scale data based on kNN

Density Peak (DPeak) clustering algorithm is not applicable for large scale data, due to two quantities, i.e, ρ and δ, are both obtained by brute force algorithm with complexity O(n2). Thus, a simple but fast DPeak, namely FastDPeak,11https://github.com/XiaoLHu/FastDPeak?tdsourcetag=s_pcqq_aiomsg. i...

Full description

Saved in:

Bibliographic Details
Published in:	Knowledge-based systems 2020-01, Vol.187, p.104824, Article 104824
Main Authors:	Chen, Yewang, Hu, Xiaoliang, Fan, Wentao, Shen, Lianlian, Zhang, Zheng, Liu, Xin, Du, Jixiang, Li, Haibo, Chen, Yi, Li, Hailin
Format:	Article
Language:	English
Subjects:	Algorithms Clustering Complexity Density Density peak FastDPeak kNN-density
Citations:	Items that this one cites Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Density Peak (DPeak) clustering algorithm is not applicable for large scale data, due to two quantities, i.e, ρ and δ, are both obtained by brute force algorithm with complexity O(n2). Thus, a simple but fast DPeak, namely FastDPeak,11https://github.com/XiaoLHu/FastDPeak?tdsourcetag=s_pcqq_aiomsg. is proposed, which runs in about O(nlog(n)) expected time in the intrinsic dimensionality. It replaces density with kNN-density, which is computed by fast kNN algorithm such as cover tree, yielding huge improvement for density computations. Based on kNN-density, local density peaks and non-local density peaks are identified, and a fast algorithm, which uses two different strategies to compute δ for them, is also proposed with complexity O(n). Experimental results show that FastDPeak is effective and outperforms other variants of DPeak.
ISSN:	0950-7051 1872-7409
DOI:	10.1016/j.knosys.2019.06.032