
Dynamic time warping-based imputation for univariate time series data


Bibliographic Details
Published in: Pattern Recognition Letters 2020-11, Vol.139, p.139-147
Main Authors: Phan, Thi-Thu-Hong; Poisson Caillault, Émilie; Lefebvre, Alain; Bigand, André
Format: Article
Language: English
Description
Summary:
• An effective approach to filling large gap(s) in univariate time series is proposed.
• The DTW algorithm is applied to find pattern(s) similar to the sub-sequence before the gap.
• Shape-feature extraction is combined with DTW to reduce insignificant solutions.
• The quantitative and visual performance of 6 methods is compared on 8 time series.
• All methods are implemented in R.

Time series with missing values occur in almost every domain of the applied sciences. Ignoring missing values can lead to a loss of efficiency and to unreliable results, especially when large sub-sequence(s) are missing. This paper proposes an approach to fill in large gap(s) within time series data under the assumption of effective information. To impute the missing values, we find the sub-sequence most similar to the sub-sequence before (resp. after) the gap, then complete the gap with the sub-sequence that follows (resp. precedes) the most similar one. The Dynamic Time Warping (DTW) algorithm is applied to compare sub-sequences and is combined with a shape-feature extraction algorithm to reduce insignificant solutions. Eight well-known, real-world data sets are used to evaluate the performance of the proposed approach against five other methods on different indicators. The results show that our approach is the most robust for time series with high auto-correlation and cross-correlation, strong seasonality, large gap(s), and complex distributions.
ISSN: 0167-8655, 1872-7344
DOI: 10.1016/j.patrec.2017.08.019
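
The imputation idea described in the summary (take the sub-sequence just before the gap as a query, find its closest match elsewhere in the series under DTW, and copy the values that follow that match into the gap) can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the function names `dtw_distance` and `impute_gap` are hypothetical, the query length is assumed to equal the gap length, the local cost is assumed to be the absolute difference, and the shape-feature pre-filtering step is omitted entirely. Only the "before the gap" direction is shown.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two numeric sequences.

    Assumption: local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def impute_gap(series, gap_start, gap_len):
    """Fill the gap series[gap_start:gap_start + gap_len] (marked with None).

    Hypothetical helper: the query is the gap_len values right before the gap
    (an assumption of this sketch). Every fully observed window of the same
    length is compared to the query with DTW, and the values following the
    best window are copied into the gap.
    """
    query_start = gap_start - gap_len
    if query_start < 0:
        raise ValueError("not enough data before the gap to build a query")
    query = series[query_start:gap_start]
    if any(v is None for v in query):
        raise ValueError("query window contains missing values")

    best_pos, best_cost = None, math.inf
    for pos in range(len(series) - 2 * gap_len + 1):
        # Candidate window plus the block that would be copied into the gap.
        block = series[pos:pos + 2 * gap_len]
        if any(v is None for v in block):
            continue  # skips anything overlapping the gap, including the query itself
        c = dtw_distance(query, block[:gap_len])
        if c < best_cost:
            best_pos, best_cost = pos, c

    if best_pos is None:
        raise ValueError("no fully observed candidate window found")

    filled = list(series)
    filled[gap_start:gap_start + gap_len] = series[best_pos + gap_len:best_pos + 2 * gap_len]
    return filled


# Usage on a strictly periodic toy series with a 3-value gap.
full = [0, 1, 2, 3, 2, 1] * 4
observed = list(full)
observed[12:15] = [None, None, None]
filled = impute_gap(observed, gap_start=12, gap_len=3)
```

On this strongly seasonal toy series the query `[3, 2, 1]` has an exact match earlier in the data, so the gap is restored exactly. On real data the quality of the fill depends on how well the series repeats itself, which is consistent with the summary's remark that the approach works best for series with high auto-correlation and strong seasonality.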