SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting

Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (for...

Bibliographic Details
Published in: Machine learning 2023-07, Vol.112 (7), p.2555-2591
Main Authors: Godahewa, Rakshitha; Webb, Geoffrey I.; Schmidt, Daniel; Bergmeir, Christoph
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3
cites cdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3
container_end_page 2591
container_issue 7
container_start_page 2555
container_title Machine learning
container_volume 112
creator Godahewa, Rakshitha
Webb, Geoffrey I.
Schmidt, Daniel
Bergmeir, Christoph
description Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (forests, gradient-boosting) have become popular recently due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These enable us to use the rich methodology from the literature on TAR models to define a hierarchical TAR model as a regression tree that trains globally across series, which we call SETAR-Tree. In contrast to the general-purpose tree-based models that do not primarily focus on forecasting, and calculate averages at the leaf nodes, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves allowing the models to learn cross-series information and also uses some time-series-specific splitting and stopping procedures. The depth of the tree is controlled by conducting a statistical linearity test commonly employed in TAR models, as well as measuring the error reduction percentage at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest where the forecasts provided by a collection of diverse SETAR-Trees are combined during the forecasting process. In our evaluation on eight publicly available datasets, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics.
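The description above mentions pooled regression (PR) models in the leaves, a threshold split, and an error reduction percentage. The following is a minimal sketch of one such split, not the authors' implementation: the function names (`embed`, `fit_pr`, `best_split`), the quantile grid of candidate thresholds, and the `min_leaf` guard are all assumptions made for illustration, and the statistical linearity test named in the description is omitted.

```python
import numpy as np

def embed(series_list, n_lags):
    """Stack lagged windows from every series into one pooled design matrix."""
    rows, targets = [], []
    for s in series_list:
        for t in range(n_lags, len(s)):
            # Most recent lag first: [y[t-1], y[t-2], ..., y[t-n_lags]]
            rows.append(s[t - n_lags:t][::-1])
            targets.append(s[t])
    return np.array(rows, dtype=float), np.array(targets, dtype=float)

def fit_pr(X, y):
    """Fit a pooled linear regression with intercept; return coefficients and SSE."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)

def best_split(X, y, n_candidates=15, min_leaf=10):
    """Search each lag and a grid of thresholds (hypothetical choice) for the
    split whose two child regressions give the lowest total SSE."""
    _, parent_sse = fit_pr(X, y)
    best = None
    qs = np.linspace(0.1, 0.9, n_candidates)
    for lag in range(X.shape[1]):
        # Candidate thresholds: interior quantiles of this lag's values.
        for thr in np.quantile(X[:, lag], qs):
            left = X[:, lag] < thr
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            sse = fit_pr(X[left], y[left])[1] + fit_pr(X[~left], y[~left])[1]
            if best is None or sse < best["sse"]:
                best = {"lag": lag, "threshold": float(thr), "sse": sse}
    if best is not None:
        # Fraction of the parent node's training error removed by the split.
        best["error_reduction"] = (
            (parent_sse - best["sse"]) / parent_sse if parent_sse > 0 else 0.0
        )
    return best
```

Calling `best_split(*embed(list_of_arrays, n_lags=2))` returns the chosen lag index, threshold, and error reduction for a single node. A full tree would apply this recursively and stop when the reduction falls below a chosen cut-off.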
doi_str_mv 10.1007/s10994-023-06316-x
format article
addtitle Mach Learn
date 2023-07-01
eissn 1573-0565
genre article
oa free_for_read
orcidid https://orcid.org/0000-0002-1333-7249
publisher New York: Springer US
rights The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
toplevel peer_reviewed
online_resources
tpages 37
fulltext fulltext
identifier ISSN: 0885-6125
ispartof Machine learning, 2023-07, Vol.112 (7), p.2555-2591
issn 0885-6125
1573-0565
language eng
recordid cdi_proquest_journals_2837223078
source Springer Link
subjects Accuracy
Algorithms
Artificial Intelligence
Autoregressive models
Computer Science
Control
Error analysis
Error reduction
Forecasting
Linearity
Machine Learning
Mechatronics
Natural Language Processing (NLP)
Regression analysis
Regression models
Robotics
Simulation and Modeling
Special Issue of the ECML PKDD 2022 Journal Track
Statistical analysis
Time series
title SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T16%3A03%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SETAR-Tree:%20a%20novel%20and%20accurate%20tree%20algorithm%20for%20global%20time%20series%20forecasting&rft.jtitle=Machine%20learning&rft.au=Godahewa,%20Rakshitha&rft.date=2023-07-01&rft.volume=112&rft.issue=7&rft.spage=2555&rft.epage=2591&rft.pages=2555-2591&rft.issn=0885-6125&rft.eissn=1573-0565&rft_id=info:doi/10.1007/s10994-023-06316-x&rft_dat=%3Cproquest_cross%3E2837223078%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2837223078&rft_id=info:pmid/&rfr_iscdi=true