SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting
Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (for...
Published in: | Machine learning 2023-07, Vol.112 (7), p.2555-2591 |
---|---|
Main Authors: | Godahewa, Rakshitha; Webb, Geoffrey I.; Schmidt, Daniel; Bergmeir, Christoph |
Format: | Article |
Language: | English |
Subjects: | |
cited_by | cdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3 |
---|---|
cites | cdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3 |
container_end_page | 2591 |
container_issue | 7 |
container_start_page | 2555 |
container_title | Machine learning |
container_volume | 112 |
creator | Godahewa, Rakshitha; Webb, Geoffrey I.; Schmidt, Daniel; Bergmeir, Christoph |
description | Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (forests, gradient-boosting) have become popular recently due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These enable us to use the rich methodology from the literature on TAR models to define a hierarchical TAR model as a regression tree that trains globally across series, which we call SETAR-Tree. In contrast to the general-purpose tree-based models that do not primarily focus on forecasting, and calculate averages at the leaf nodes, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves allowing the models to learn cross-series information and also uses some time-series-specific splitting and stopping procedures. The depth of the tree is controlled by conducting a statistical linearity test commonly employed in TAR models, as well as measuring the error reduction percentage at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest where the forecasts provided by a collection of diverse SETAR-Trees are combined during the forecasting process. In our evaluation on eight publicly available datasets, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics. |
doi_str_mv | 10.1007/s10994-023-06316-x |
format | article |
publisher | New York: Springer US |
rights | The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
orcidid | https://orcid.org/0000-0002-1333-7249 |
tpages | 37 |
oa | free_for_read |
fulltext | fulltext |
identifier | ISSN: 0885-6125 |
ispartof | Machine learning, 2023-07, Vol.112 (7), p.2555-2591 |
issn | 0885-6125 1573-0565 |
language | eng |
recordid | cdi_proquest_journals_2837223078 |
source | Springer Link |
subjects | Accuracy; Algorithms; Artificial Intelligence; Autoregressive models; Computer Science; Control; Error analysis; Error reduction; Forecasting; Linearity; Machine Learning; Mechatronics; Natural Language Processing (NLP); Regression analysis; Regression models; Robotics; Simulation and Modeling; Special Issue of the ECML PKDD 2022 Journal Track; Statistical analysis; Time series |
title | SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T16%3A03%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SETAR-Tree:%20a%20novel%20and%20accurate%20tree%20algorithm%20for%20global%20time%20series%20forecasting&rft.jtitle=Machine%20learning&rft.au=Godahewa,%20Rakshitha&rft.date=2023-07-01&rft.volume=112&rft.issue=7&rft.spage=2555&rft.epage=2591&rft.pages=2555-2591&rft.issn=0885-6125&rft.eissn=1573-0565&rft_id=info:doi/10.1007/s10994-023-06316-x&rft_dat=%3Cproquest_cross%3E2837223078%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2837223078&rft_id=info:pmid/&rfr_iscdi=true |