A K-variate Time Series Is Worth K Words: Evolution of the Vanilla Transformer Architecture for Long-term Multivariate Time Series Forecasting
Multivariate time series forecasting (MTSF) is a fundamental problem in numerous real-world applications. Recently, the Transformer has become the de facto solution for MTSF, especially for long-term cases. However, apart from the one forward operation, the basic configurations of existing MTSF Transformer architectures have barely been carefully verified. In this study, we point out that the current tokenization strategy in MTSF Transformer architectures ignores the token uniformity inductive bias of Transformers. As a result, the vanilla MTSF Transformer struggles to capture details in time series and presents inferior performance. Based on this observation, we make a series of evolutionary modifications to the basic architecture of the vanilla MTSF Transformer: we revise the flawed tokenization strategy, along with the decoder structure and the embeddings. Surprisingly, the evolved simple Transformer architecture is highly effective: it successfully avoids the over-smoothing phenomenon of the vanilla MTSF Transformer, achieves more detailed and accurate predictions, and even substantially outperforms state-of-the-art Transformers that are well designed for MTSF.
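The title's analogy ("a K-variate time series is worth K words") points at the tokenization change the abstract describes: instead of treating each timestep as a token, each of the K variates (a length-L history) is embedded as a single token, so self-attention mixes information across variates rather than across time. The abstract does not specify the layers involved, so the following is only a minimal PyTorch sketch of that variate-as-token idea; the module and parameter names (`VariateTokenizer`, `d_model`, the shared linear projection) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the "K variates -> K tokens" idea (illustrative only;
# not the paper's code).
import torch
import torch.nn as nn

class VariateTokenizer(nn.Module):
    """Embed each of the K variates (a length-L history) as one token."""

    def __init__(self, lookback_len: int, d_model: int):
        super().__init__()
        # One linear map shared across variates: R^L -> R^d_model.
        self.proj = nn.Linear(lookback_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, lookback_len, K) -- a K-variate input window.
        x = x.transpose(1, 2)   # (batch, K, lookback_len)
        return self.proj(x)     # (batch, K, d_model): K tokens

# Hypothetical usage: K = 7 variates, 96-step history, tiny encoder.
B, L, K, D = 32, 96, 7, 64
tokens = VariateTokenizer(L, D)(torch.randn(B, L, K))
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True),
    num_layers=2,
)
out = encoder(tokens)   # attention runs across the K variate tokens
print(out.shape)        # torch.Size([32, 7, 64])
```

Under this tokenization the sequence length seen by attention is K, not L, so each token already summarizes a whole variate's history; this is one plausible reading of how the evolved architecture sidesteps the over-smoothing across timesteps that the abstract attributes to the vanilla design.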
Published in: | arXiv.org, 2022-12 |
Main Authors: | Zhou, Zanwei; Zhong, Ruizhe; Chen, Yang; Wang, Yan; Yang, Xiaokang; Shen, Wei |
Format: | Article |
Language: | English |
Subjects: | Evolution; Forecasting; Multivariate analysis; Time series; Vanilla |
Online Access: | Get full text |
container_title | arXiv.org |
creator | Zhou, Zanwei; Zhong, Ruizhe; Chen, Yang; Wang, Yan; Yang, Xiaokang; Shen, Wei |
description | Multivariate time series forecasting (MTSF) is a fundamental problem in numerous real-world applications. Recently, the Transformer has become the de facto solution for MTSF, especially for long-term cases. However, apart from the one forward operation, the basic configurations of existing MTSF Transformer architectures have barely been carefully verified. In this study, we point out that the current tokenization strategy in MTSF Transformer architectures ignores the token uniformity inductive bias of Transformers. As a result, the vanilla MTSF Transformer struggles to capture details in time series and presents inferior performance. Based on this observation, we make a series of evolutionary modifications to the basic architecture of the vanilla MTSF Transformer: we revise the flawed tokenization strategy, along with the decoder structure and the embeddings. Surprisingly, the evolved simple Transformer architecture is highly effective: it successfully avoids the over-smoothing phenomenon of the vanilla MTSF Transformer, achieves more detailed and accurate predictions, and even substantially outperforms state-of-the-art Transformers that are well designed for MTSF. |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2022-12 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2747737077 |
source | Publicly Available Content Database |
subjects | Evolution; Forecasting; Multivariate analysis; Time series; Vanilla |
title | A K-variate Time Series Is Worth K Words: Evolution of the Vanilla Transformer Architecture for Long-term Multivariate Time Series Forecasting |