A K-variate Time Series Is Worth K Words: Evolution of the Vanilla Transformer Architecture for Long-term Multivariate Time Series Forecasting

Multivariate time series forecasting (MTSF) is a fundamental problem in numerous real-world applications. Recently, the Transformer has become the de facto solution for MTSF, especially for long-term cases. However, apart from the forward operation itself, the basic configurations of existing MTSF Transformer architectures have barely been carefully verified. In this study, we point out that the current tokenization strategy in MTSF Transformer architectures ignores the token uniformity inductive bias of Transformers. As a result, the vanilla MTSF Transformer struggles to capture details in time series and delivers inferior performance. Based on this observation, we make a series of evolutionary changes to the basic architecture of the vanilla MTSF Transformer, revising the flawed tokenization strategy along with the decoder structure and the embeddings. Surprisingly, the evolved simple Transformer architecture is highly effective: it avoids the over-smoothing phenomenon of the vanilla MTSF Transformer, achieves more detailed and accurate predictions, and even substantially outperforms state-of-the-art Transformers that are well designed for MTSF.

Bibliographic Details
Published in: arXiv.org, 2022-12
Main Authors: Zhou, Zanwei; Zhong, Ruizhe; Chen, Yang; Wang, Yan; Yang, Xiaokang; Shen, Wei
Format: Article
Language: English
EISSN: 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Subjects: Evolution; Forecasting; Multivariate analysis; Time series; Vanilla
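The title's "K words" framing and the abstract's critique of the current tokenization strategy suggest embedding each of the K variates as one token, rather than embedding each time step (a K-dimensional cross-section) as one token. The record does not spell out the paper's exact mechanism, so the following PyTorch sketch only contrasts the two tokenization strategies under that reading; all names, shapes, and layer choices (step_embed, series_embed, d_model, and so on) are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the paper).
batch, seq_len, num_vars, d_model = 32, 96, 7, 512
x = torch.randn(batch, seq_len, num_vars)  # (B, L, K) multivariate series

# Vanilla MTSF tokenization: each time step becomes one token,
# yielding L tokens of the model dimension.
step_embed = nn.Linear(num_vars, d_model)
step_tokens = step_embed(x)                      # (B, L, d_model)

# Variate-as-token, as "K words" suggests: each variate's whole
# length-L history becomes one token, yielding K tokens.
series_embed = nn.Linear(seq_len, d_model)
series_tokens = series_embed(x.transpose(1, 2))  # (B, K, d_model)

# Either token sequence can then feed a standard encoder, e.g.:
encoder = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
out = encoder(series_tokens)                     # (B, K, d_model)
```

With K variate tokens, self-attention mixes information across series rather than across the L time steps, which is one plausible way such an architecture could sidestep the temporal over-smoothing the abstract describes.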