Loading…

A skip-list approach for efficiently processing forecasting queries

Time series data is common in many settings including scientific and financial applications. In these applications, the amount of data is often very large. We seek to support prediction queries over time series data. Prediction relies on model building which can be too expensive to be practical if i...

Full description

Saved in:
Bibliographic Details
Published in:Proceedings of the VLDB Endowment 2008-08, Vol.1 (1), p.984-995
Main Authors: Ge, Tingjian, Zdonik, Stan
Format: Article
Language:English
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Time series data is common in many settings including scientific and financial applications. In these applications, the amount of data is often very large. We seek to support prediction queries over time series data. Prediction relies on model building which can be too expensive to be practical if it is based on a large number of data points. We propose to use statistical tests of hypotheses to choose a proper subset of data points to use for a given prediction query interval. This involves two steps: choosing a proper history length and choosing the number of data points to use within this history. Further, we use an I/O conscious skip list data structure to provide samples of the original data set. Based on the statistics collected for a query workload, which we model as a probability mass function (PMF) over query intervals, we devise a randomized algorithm that selects a set of pre-built models (PM's) to construct, subject to some maintenance cost constraint when there are updates. Given this set of PM's, we discuss interesting query processing strategies for not only point queries, but also range, aggregation, and JOIN queries. We conduct a comprehensive empirical study on real world datasets to verify the effectiveness of our approaches and algorithms.
ISSN:2150-8097
2150-8097
DOI:10.14778/1453856.1453962