
SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting

Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (for...


Bibliographic Details
Published in:	Machine learning 2023-07, Vol.112 (7), p.2555-2591
Main Authors:	Godahewa, Rakshitha; Webb, Geoffrey I.; Schmidt, Daniel; Bergmeir, Christoph
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Artificial Intelligence; Autoregressive models; Computer Science; Control; Error analysis; Error reduction; Forecasting; Linearity; Machine Learning; Mechatronics; Natural Language Processing (NLP); Regression analysis; Regression models; Robotics; Simulation and Modeling; Special Issue of the ECML PKDD 2022 Journal Track; Statistical analysis; Time series

cited_by	cdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3
cites	cdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3
container_end_page	2591
container_issue	7
container_start_page	2555
container_title	Machine learning
container_volume	112
creator	Godahewa, Rakshitha; Webb, Geoffrey I.; Schmidt, Daniel; Bergmeir, Christoph
description	Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (forests, gradient-boosting) have become popular recently due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These enable us to use the rich methodology from the literature on TAR models to define a hierarchical TAR model as a regression tree that trains globally across series, which we call SETAR-Tree. In contrast to the general-purpose tree-based models that do not primarily focus on forecasting, and calculate averages at the leaf nodes, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves allowing the models to learn cross-series information and also uses some time-series-specific splitting and stopping procedures. The depth of the tree is controlled by conducting a statistical linearity test commonly employed in TAR models, as well as measuring the error reduction percentage at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest where the forecasts provided by a collection of diverse SETAR-Trees are combined during the forecasting process. In our evaluation on eight publicly available datasets, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics.
doi_str_mv	10.1007/s10994-023-06316-x
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2837223078</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2837223078</sourcerecordid><originalsourceid>FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3</originalsourceid><addsrcrecordid>eNp9kN9LwzAQx4MoOKf_gE8Bn6OXpEla38aYP2AgzPkcsvZaO7p2Jq3M_97MCr75dNzd93MHH0KuOdxyAHMXOGRZwkBIBlpyzQ4nZMKVia3S6pRMIE0V01yoc3IRwhYAhE71hKxeF-vZiq094j11tO0-saGuLajL88G7HmkfV9Q1Vefr_n1Hy87Tquk2rqF9vUMa0NcYjmPMXejrtrokZ6VrAl791il5e1is509s-fL4PJ8tWS617BnyXMlUbyAptEEuReQNameglBkkZeEKKLnZZEpxpZIsk1gYLrk0WamxcHJKbsa7e999DBh6u-0G38aXVqTSCCHBpDElxlTuuxA8lnbv653zX5aDPbqzozsb3dkfd_YQITlCIYbbCv3f6X-ob1K7cXk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2837223078</pqid></control><display><type>article</type><title>SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting</title><source>Springer Link</source><creator>Godahewa, Rakshitha ; Webb, Geoffrey I. ; Schmidt, Daniel ; Bergmeir, Christoph</creator><creatorcontrib>Godahewa, Rakshitha ; Webb, Geoffrey I. ; Schmidt, Daniel ; Bergmeir, Christoph</creatorcontrib><description>Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (forests, gradient-boosting) have become popular recently due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These enable us to use the rich methodology from the literature on TAR models to define a hierarchical TAR model as a regression tree that trains globally across series, which we call SETAR-Tree. 
In contrast to the general-purpose tree-based models that do not primarily focus on forecasting, and calculate averages at the leaf nodes, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves allowing the models to learn cross-series information and also uses some time-series-specific splitting and stopping procedures. The depth of the tree is controlled by conducting a statistical linearity test commonly employed in TAR models, as well as measuring the error reduction percentage at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest where the forecasts provided by a collection of diverse SETAR-Trees are combined during the forecasting process. In our evaluation on eight publicly available datasets, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics.</description><identifier>ISSN: 0885-6125</identifier><identifier>EISSN: 1573-0565</identifier><identifier>DOI: 10.1007/s10994-023-06316-x</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Accuracy ; Algorithms ; Artificial Intelligence ; Autoregressive models ; Computer Science ; Control ; Error analysis ; Error reduction ; Forecasting ; Linearity ; Machine Learning ; Mechatronics ; Natural Language Processing (NLP) ; Regression analysis ; Regression models ; Robotics ; Simulation and Modeling ; Special Issue of the ECML PKDD 2022 Journal Track ; Statistical analysis ; Time series</subject><ispartof>Machine learning, 2023-07, Vol.112 (7), p.2555-2591</ispartof><rights>The Author(s) 2023</rights><rights>The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3</citedby><cites>FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3</cites><orcidid>0000-0002-1333-7249</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Godahewa, Rakshitha</creatorcontrib><creatorcontrib>Webb, Geoffrey I.</creatorcontrib><creatorcontrib>Schmidt, Daniel</creatorcontrib><creatorcontrib>Bergmeir, Christoph</creatorcontrib><title>SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting</title><title>Machine learning</title><addtitle>Mach Learn</addtitle><description>Threshold Autoregressive (TAR) models have been widely used by statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (forests, gradient-boosting) have become popular recently due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These enable us to use the rich methodology from the literature on TAR models to define a hierarchical TAR model as a regression tree that trains globally across series, which we call SETAR-Tree. 
In contrast to the general-purpose tree-based models that do not primarily focus on forecasting, and calculate averages at the leaf nodes, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves allowing the models to learn cross-series information and also uses some time-series-specific splitting and stopping procedures. The depth of the tree is controlled by conducting a statistical linearity test commonly employed in TAR models, as well as measuring the error reduction percentage at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest where the forecasts provided by a collection of diverse SETAR-Trees are combined during the forecasting process. In our evaluation on eight publicly available datasets, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Autoregressive models</subject><subject>Computer Science</subject><subject>Control</subject><subject>Error analysis</subject><subject>Error reduction</subject><subject>Forecasting</subject><subject>Linearity</subject><subject>Machine Learning</subject><subject>Mechatronics</subject><subject>Natural Language Processing (NLP)</subject><subject>Regression analysis</subject><subject>Regression models</subject><subject>Robotics</subject><subject>Simulation and Modeling</subject><subject>Special Issue of the ECML PKDD 2022 Journal Track</subject><subject>Statistical analysis</subject><subject>Time 
series</subject><issn>0885-6125</issn><issn>1573-0565</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9kN9LwzAQx4MoOKf_gE8Bn6OXpEla38aYP2AgzPkcsvZaO7p2Jq3M_97MCr75dNzd93MHH0KuOdxyAHMXOGRZwkBIBlpyzQ4nZMKVia3S6pRMIE0V01yoc3IRwhYAhE71hKxeF-vZiq094j11tO0-saGuLajL88G7HmkfV9Q1Vefr_n1Hy87Tquk2rqF9vUMa0NcYjmPMXejrtrokZ6VrAl791il5e1is509s-fL4PJ8tWS617BnyXMlUbyAptEEuReQNameglBkkZeEKKLnZZEpxpZIsk1gYLrk0WamxcHJKbsa7e999DBh6u-0G38aXVqTSCCHBpDElxlTuuxA8lnbv653zX5aDPbqzozsb3dkfd_YQITlCIYbbCv3f6X-ob1K7cXk</recordid><startdate>20230701</startdate><enddate>20230701</enddate><creator>Godahewa, Rakshitha</creator><creator>Webb, Geoffrey I.</creator><creator>Schmidt, Daniel</creator><creator>Bergmeir, Christoph</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>C6C</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7XB</scope><scope>88I</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M2P</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0002-1333-7249</orcidid></search><sort><creationdate>20230701</creationdate><title>SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting</title><author>Godahewa, Rakshitha ; Webb, Geoffrey I. 
; Schmidt, Daniel ; Bergmeir, Christoph</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Autoregressive models</topic><topic>Computer Science</topic><topic>Control</topic><topic>Error analysis</topic><topic>Error reduction</topic><topic>Forecasting</topic><topic>Linearity</topic><topic>Machine Learning</topic><topic>Mechatronics</topic><topic>Natural Language Processing (NLP)</topic><topic>Regression analysis</topic><topic>Regression models</topic><topic>Robotics</topic><topic>Simulation and Modeling</topic><topic>Special Issue of the ECML PKDD 2022 Journal Track</topic><topic>Statistical analysis</topic><topic>Time series</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Godahewa, Rakshitha</creatorcontrib><creatorcontrib>Webb, Geoffrey I.</creatorcontrib><creatorcontrib>Schmidt, Daniel</creatorcontrib><creatorcontrib>Bergmeir, Christoph</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Science Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest 
Central</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><jtitle>Machine learning</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Godahewa, Rakshitha</au><au>Webb, Geoffrey I.</au><au>Schmidt, Daniel</au><au>Bergmeir, Christoph</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting</atitle><jtitle>Machine learning</jtitle><stitle>Mach Learn</stitle><date>2023-07-01</date><risdate>2023</risdate><volume>112</volume><issue>7</issue><spage>2555</spage><epage>2591</epage><pages>2555-2591</pages><issn>0885-6125</issn><eissn>1573-0565</eissn><abstract>Threshold Autoregressive (TAR) models have been widely used by 
statisticians for non-linear time series forecasting during the past few decades, due to their simplicity and mathematical properties. On the other hand, in the forecasting community, general-purpose tree-based regression algorithms (forests, gradient-boosting) have become popular recently due to their ease of use and accuracy. In this paper, we explore the close connections between TAR models and regression trees. These enable us to use the rich methodology from the literature on TAR models to define a hierarchical TAR model as a regression tree that trains globally across series, which we call SETAR-Tree. In contrast to the general-purpose tree-based models that do not primarily focus on forecasting, and calculate averages at the leaf nodes, we introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves allowing the models to learn cross-series information and also uses some time-series-specific splitting and stopping procedures. The depth of the tree is controlled by conducting a statistical linearity test commonly employed in TAR models, as well as measuring the error reduction percentage at each node split. Thus, the proposed tree model requires minimal external hyperparameter tuning and provides competitive results under its default configuration. We also use this tree algorithm to develop a forest where the forecasts provided by a collection of diverse SETAR-Trees are combined during the forecasting process. In our evaluation on eight publicly available datasets, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms and forecasting benchmarks across four evaluation metrics.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10994-023-06316-x</doi><tpages>37</tpages><orcidid>https://orcid.org/0000-0002-1333-7249</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0885-6125
ispartof	Machine learning, 2023-07, Vol.112 (7), p.2555-2591
issn	0885-6125 1573-0565
language	eng
recordid	cdi_proquest_journals_2837223078
source	Springer Link
subjects	Accuracy; Algorithms; Artificial Intelligence; Autoregressive models; Computer Science; Control; Error analysis; Error reduction; Forecasting; Linearity; Machine Learning; Mechatronics; Natural Language Processing (NLP); Regression analysis; Regression models; Robotics; Simulation and Modeling; Special Issue of the ECML PKDD 2022 Journal Track; Statistical analysis; Time series
title	SETAR-Tree: a novel and accurate tree algorithm for global time series forecasting
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T16%3A03%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SETAR-Tree:%20a%20novel%20and%20accurate%20tree%20algorithm%20for%20global%20time%20series%20forecasting&rft.jtitle=Machine%20learning&rft.au=Godahewa,%20Rakshitha&rft.date=2023-07-01&rft.volume=112&rft.issue=7&rft.spage=2555&rft.epage=2591&rft.pages=2555-2591&rft.issn=0885-6125&rft.eissn=1573-0565&rft_id=info:doi/10.1007/s10994-023-06316-x&rft_dat=%3Cproquest_cross%3E2837223078%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c363t-e1c5386b04d67e132eca7e6a70f3904fdad0f17b9551554993ed7131379f6eda3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2837223078&rft_id=info:pmid/&rfr_iscdi=true
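
Illustrative note: the description field above mentions Pooled Regression (PR) models trained across series, threshold-style node splits, and an error reduction percentage measured at each split. The following is a minimal sketch of one such split, not the authors' implementation. The function names (embed, fit_pr, best_split), the quantile grid of candidate thresholds, the minimum leaf size, and the simulated two-regime data are all assumptions made for illustration. The statistical linearity test and the forest variant mentioned in the description are not implemented here.

```python
import numpy as np

def embed(series_list, lags):
    """Stack lagged windows from every series into one pooled design matrix."""
    X, y = [], []
    for s in series_list:
        for t in range(lags, len(s)):
            X.append(s[t - lags:t][::-1])  # column 0 holds lag 1, column 1 lag 2, ...
            y.append(s[t])
    return np.asarray(X, dtype=float), np.asarray(y, dtype=float)

def fit_pr(X, y):
    """Pooled regression: ordinary least squares with an intercept; returns (coef, SSE)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = float(np.sum((y - A @ coef) ** 2))
    return coef, sse

def best_split(X, y, n_thresholds=15, min_leaf=10):
    """Search (lag, threshold) pairs; keep the one with the lowest summed child SSE."""
    best = None
    for lag in range(X.shape[1]):
        # Candidate thresholds: an assumed grid of quantiles of the lag values.
        for thr in np.quantile(X[:, lag], np.linspace(0.1, 0.9, n_thresholds)):
            left = X[:, lag] < thr
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            sse = fit_pr(X[left], y[left])[1] + fit_pr(X[~left], y[~left])[1]
            if best is None or sse < best[0]:
                best = (sse, lag, float(thr))
    return best

# Hypothetical data: three series from a two-regime threshold autoregressive process.
rng = np.random.default_rng(0)

def simulate(n):
    values = [0.0]
    for _ in range(n):
        prev = values[-1]
        phi = 0.8 if prev < 0 else -0.5  # the regime depends on the previous value
        values.append(phi * prev + rng.normal(scale=0.5))
    return np.array(values)

series = [simulate(200) for _ in range(3)]
X, y = embed(series, lags=2)
_, parent_sse = fit_pr(X, y)
child_sse, split_lag, threshold = best_split(X, y)
reduction = (parent_sse - child_sse) / parent_sse  # error reduction as a fraction
print(f"split on lag {split_lag + 1} at {threshold:.3f}, error reduction {reduction:.1%}")
```

In a full tree, a split like this would presumably be accepted or rejected by the stopping rules the description names (the linearity test and the error reduction percentage); the cut-off values are not given in this record, so none are applied in the sketch.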