A K-variate Time Series Is Worth K Words: Evolution of the Vanilla Transformer Architecture for Long-term Multivariate Time Series Forecasting

Multivariate time series forecasting (MTSF) is a fundamental problem in numerous real-world applications. Recently, the Transformer has become the de facto solution for MTSF, especially for long-term cases. However, apart from the forward operation itself, the basic configurations of existing MTSF Transformer architectures have barely been carefully verified. In this study, we point out that the current tokenization strategy in MTSF Transformer architectures ignores the token uniformity inductive bias of Transformers. As a result, the vanilla MTSF Transformer struggles to capture details in time series and delivers inferior performance. Based on this observation, we make a series of evolutionary changes to the basic architecture of the vanilla MTSF Transformer, revising the flawed tokenization strategy along with the decoder structure and the embeddings. Surprisingly, the evolved simple Transformer architecture is highly effective: it avoids the over-smoothing phenomenon of the vanilla MTSF Transformer, achieves more detailed and accurate predictions, and even substantially outperforms state-of-the-art Transformers that are well designed for MTSF.

Bibliographic Details
Published in: arXiv.org, 2022-12
Main Authors: Zhou, Zanwei; Zhong, Ruizhe; Chen, Yang; Wang, Yan; Yang, Xiaokang; Shen, Wei
Format: Article
Language: English
EISSN: 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Subjects: Evolution; Forecasting; Multivariate analysis; Time series; Vanilla
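The title's "K words" framing and the abstract's critique of the current tokenization strategy suggest embedding each of the K variates as one token, rather than embedding each time step (a K-dimensional cross-section) as one token. The record does not spell out the paper's exact mechanism, so the following PyTorch sketch only contrasts the two tokenization strategies under that reading; all names, shapes, and layer choices (step_embed, series_embed, d_model, and so on) are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the paper).
batch, seq_len, num_vars, d_model = 32, 96, 7, 512
x = torch.randn(batch, seq_len, num_vars)  # (B, L, K) multivariate series

# Vanilla MTSF tokenization: each time step becomes one token,
# yielding L tokens of the model dimension.
step_embed = nn.Linear(num_vars, d_model)
step_tokens = step_embed(x)                      # (B, L, d_model)

# Variate-as-token, as "K words" suggests: each variate's whole
# length-L history becomes one token, yielding K tokens.
series_embed = nn.Linear(seq_len, d_model)
series_tokens = series_embed(x.transpose(1, 2))  # (B, K, d_model)

# Either token sequence can then feed a standard encoder, e.g.:
encoder = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
out = encoder(series_tokens)                     # (B, K, d_model)
```

With K variate tokens, self-attention mixes information across series rather than across the L time steps, which is one plausible way such an architecture could sidestep the temporal over-smoothing the abstract describes.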