Loading…

Ares: Automatic Disaggregation of Historical Data

We address the challenge of reconstructing historical counts from aggregated, possibly overlapping historical reports. For example, given the monthly and weekly sums, how can we find the daily counts of people infected with flu? We propose an approach, called ARES (Automatic REStoration), that perfo...

Full description

Saved in:
Bibliographic Details
Main Authors: Fan Yang, Hyun Ah Song, Zongge Liu, Faloutsos, Christos, Zadorozhny, Vladimir, Sidiropoulos, Nicholas
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We address the challenge of reconstructing historical counts from aggregated, possibly overlapping historical reports. For example, given the monthly and weekly sums, how can we find the daily counts of people infected with flu? We propose an approach, called ARES (Automatic REStoration), that performs automatic data reconstruction in two phases: (1) first, it estimates the sequence of historical counts utilizing domain knowledge, such as smoothness and periodicity of historical events; (2) then, it uses the estimated sequence to learn notable patterns in the target sequence to refine the reconstructed time series. In order to derive such patterns, ARES uses an annihilating filter technique. The idea is to learn a linear shift-invariant operator whose response to the desired sequence is (approximately) zero-yielding a set of null-space equations that the desired signal should satisfy, without the need for the accompanying data. The reconstruction accuracy can be further improved by applying the second phase iteratively. We evaluate ARES on the real epidemiological data from the Tycho project and demonstrate that ARES recovers historical data from aggregated reports with high accuracy. In particular, it considerably outperforms top competitors, including least squares approximation and the more advanced H-FUSE method (42% and 34% improvement based on average RMSE, respectively).
ISSN:2375-026X
DOI:10.1109/ICDE.2018.00016