CorClustST—Correlation-based clustering of big spatio-temporal datasets
Published in: | Future generation computer systems 2020-09, Vol.110, p.610-619 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Increasing amounts of high-velocity spatio-temporal data reinforce the need for clustering algorithms which are effective for big data processing and data reduction. As currently applied spatio-temporal clustering algorithms have certain drawbacks regarding the comparability of the results, we propose an alternative spatio-temporal clustering technique which is based on empirical spatial correlations over time. As a key feature, CorClustST makes it easily possible to compare and interpret clustering results for different scenarios such as multiple underlying variables or varying time frames. In a test case, we show that the clustering strategy successfully identifies increasing spatial correlations of wind power forecast errors in Europe for longer forecast horizons. An extension of the clustering algorithm is finally presented which allows for a large-scale parallel implementation and helps to circumvent memory limitations. The proposed method will especially be helpful for researchers who aim to preprocess big spatio-temporal datasets and who intend to compare clustering results and spatial dependencies for different scenarios.
• A new spatio-temporal clustering algorithm is proposed.
• It utilizes spatial correlations over time to find meaningful clusters.
• The clustering strategy is effective for big data processing and data reduction.
• Its usefulness is demonstrated in a test case for wind power forecast errors.
• An extension of the algorithm allows for a large-scale parallelization. |
---|---|
ISSN: | 0167-739X 1872-7115 |
DOI: | 10.1016/j.future.2018.04.002 |