Loading…
Fast density peak clustering for large scale data based on kNN
Density Peak (DPeak) clustering algorithm is not applicable for large scale data, due to two quantities, i.e, ρ and δ, are both obtained by brute force algorithm with complexity O(n2). Thus, a simple but fast DPeak, namely FastDPeak,11https://github.com/XiaoLHu/FastDPeak?tdsourcetag=s_pcqq_aiomsg. i...
Saved in:
Published in: | Knowledge-based systems 2020-01, Vol.187, p.104824, Article 104824 |
---|---|
Main Authors: | , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Density Peak (DPeak) clustering algorithm is not applicable for large scale data, due to two quantities, i.e, ρ and δ, are both obtained by brute force algorithm with complexity O(n2). Thus, a simple but fast DPeak, namely FastDPeak,11https://github.com/XiaoLHu/FastDPeak?tdsourcetag=s_pcqq_aiomsg. is proposed, which runs in about O(nlog(n)) expected time in the intrinsic dimensionality. It replaces density with kNN-density, which is computed by fast kNN algorithm such as cover tree, yielding huge improvement for density computations. Based on kNN-density, local density peaks and non-local density peaks are identified, and a fast algorithm, which uses two different strategies to compute δ for them, is also proposed with complexity O(n). Experimental results show that FastDPeak is effective and outperforms other variants of DPeak. |
---|---|
ISSN: | 0950-7051 1872-7409 |
DOI: | 10.1016/j.knosys.2019.06.032 |