CorClustST—Correlation-based clustering of big spatio-temporal datasets

Bibliographic Details
Published in: Future generation computer systems 2020-09, Vol.110, p.610-619
Main Authors: Hüsch, Marc; Schyska, Bruno U.; von Bremen, Lueder
Format: Article
Language: English
Description
Summary: Increasing amounts of high-velocity spatio-temporal data reinforce the need for clustering algorithms which are effective for big data processing and data reduction. As currently applied spatio-temporal clustering algorithms have certain drawbacks regarding the comparability of the results, we propose an alternative spatio-temporal clustering technique which is based on empirical spatial correlations over time. As a key feature, CorClustST makes it easily possible to compare and interpret clustering results for different scenarios such as multiple underlying variables or varying time frames. In a test case, we show that the clustering strategy successfully identifies increasing spatial correlations of wind power forecast errors in Europe for longer forecast horizons. An extension of the clustering algorithm is finally presented which allows for a large-scale parallel implementation and helps to circumvent memory limitations. The proposed method will especially be helpful for researchers who aim to preprocess big spatio-temporal datasets and who intend to compare clustering results and spatial dependencies for different scenarios.

• A new spatio-temporal clustering algorithm is proposed.
• It utilizes spatial correlations over time to find meaningful clusters.
• The clustering strategy is effective for big data processing and data reduction.
• Its usefulness is demonstrated in a test case for wind power forecast errors.
• An extension of the algorithm allows for a large-scale parallelization.
ISSN: 0167-739X, 1872-7115
DOI: 10.1016/j.future.2018.04.002