
A hybrid system based on ensemble learning to model residuals for time series forecasting

Bibliographic Details
Published in: Information sciences 2023-11, Vol. 649, p. 119614, Article 119614
Main Authors: Santos Júnior, Domingos S. de O., de Mattos Neto, Paulo S.G., de Oliveira, João F.L., Cavalcanti, George D.C.
Format: Article
Language: English
Description
Summary: The time series forecasting literature has highlighted the accuracy of hybrid systems that combine statistical linear models and Machine Learning (ML) models by modeling the residuals. These systems model linear and nonlinear patterns separately, aiming to overcome the limitations of using a single model. Such a system comprises three phases: linear modeling of the time series, forecasting the residuals with an ML model, and producing the final forecast by combining the outputs of the two previous phases. Modeling the residuals is challenging because they may present heteroscedasticity, complex nonlinear patterns, and random fluctuations; hence, specifying a single ML model is a complex task. This work proposes a hybrid system that combines a linear statistical model with an ensemble of ML models to forecast real-world time series. The proposed method employs an ensemble in the residual-modeling phase, aiming to improve the generalization capacity of the system, reduce the risk of selecting an incorrect model, expand the function space, and increase the system's accuracy. Moreover, for each time series, a data-driven search selects the ensemble parameters most suitable for that series. The experimental results show that the proposal attains superior performance and is statistically better than the related systems in the literature.
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2023.119614
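
The three phases named in the summary (linear modeling, residual forecasting with an ensemble, and combination) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not name the linear model, the ensemble members, or the combination rule, so everything below is an assumption chosen for brevity. The linear phase is a least-squares AR(1) fit, the ensemble is a set of k-nearest-neighbor regressors trained on bootstrap samples of lagged residuals, and the members are averaged. The function names `fit_ar1`, `knn_predict`, and `hybrid_forecast` and the parameter `ks` are hypothetical. The per-series parameter search described in the summary is omitted.

```python
import random
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1] (assumed stand-in for the linear model)."""
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training inputs closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return mean(train_y[i] for i in order[:k])


def hybrid_forecast(series, ks=(1, 3, 5), seed=0):
    """One-step-ahead forecast: linear forecast plus ensemble forecast of the residual."""
    # Phase 1: linear modeling of the series and its in-sample residuals.
    a, b = fit_ar1(series)
    fitted = [a + b * v for v in series[:-1]]
    residuals = [y - f for y, f in zip(series[1:], fitted)]
    linear_next = a + b * series[-1]

    # Phase 2: ensemble on the residuals. Each member is a kNN regressor
    # (assumed) trained on a bootstrap sample of (e[t-1], e[t]) pairs.
    rng = random.Random(seed)
    pairs = list(zip(residuals[:-1], residuals[1:]))
    member_forecasts = []
    for k in ks:
        sample = [rng.choice(pairs) for _ in pairs]
        xs, ys = zip(*sample)
        member_forecasts.append(knn_predict(xs, ys, residuals[-1], k))
    residual_next = mean(member_forecasts)  # simple average as the combiner (assumed)

    # Phase 3: the final forecast is the sum of the two components.
    return linear_next + residual_next, linear_next, residual_next


if __name__ == "__main__":
    # Synthetic series: a linear trend plus a repeating bump, for illustration only.
    demo = [10 + 0.5 * t + (1.5 if t % 4 == 0 else -0.5) for t in range(40)]
    total, linear_part, residual_part = hybrid_forecast(demo)
    print(total, linear_part, residual_part)
```

Averaging bootstrap-trained members is only one way to build the ensemble; the motivation stated in the summary (reducing the risk of picking a single wrong residual model) applies equally to heterogeneous members or weighted combiners.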