A multi-head attention-based transformer model for traffic flow forecasting with a comparative analysis to recurrent neural networks

Bibliographic Details
Published in: Expert Systems with Applications, 2022-09, Vol. 202, p. 117275, Article 117275
Main Authors: Reza, Selim; Ferreira, Marta Campos; Machado, J.J.M.; Tavares, João Manuel R.S.
Format: Article
Language: English
Description
Summary: Traffic flow forecasting is an essential component of an intelligent transportation system for mitigating congestion. Recurrent neural networks, particularly gated recurrent units (GRU) and long short-term memory (LSTM), have been the state-of-the-art traffic flow forecasting models for the last few years. However, a more sophisticated and resilient model is needed to capture long-range correlations in the time-series data under analysis. Transformers, which have come to dominate natural language processing by overcoming the drawbacks of recurrent neural networks, might meet this need and lead to successful time-series forecasting. This article presents a multi-head attention-based transformer model for traffic flow forecasting and compares it with a GRU-based and an LSTM-based model on the PeMS dataset. The model uses 5 heads with 5 identical encoder and decoder layers and relies on Square Subsequent Masking. The results demonstrate the promising performance of the transformer-based model in predicting long-term traffic flow patterns once it has been fed a substantial amount of data. The model also improves the mean squared error and mean absolute percentage error by (1.25−47.8)% and (32.4−83.8)%, respectively, relative to the current baselines.
• Applicability of transformers in traffic state forecasting is justified.
• A comprehensive performance comparison with GRU and LSTM is presented.
• Transformers need to be fed with big data to get good performance.
• Transformers are more suitable for capturing long-range features than GRU or LSTM.
• Proposed model improves the mean absolute percentage error over related baselines.
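The summary names Square Subsequent Masking and two error metrics without defining them. The following is a minimal sketch, not the authors' implementation. It assumes the masking refers to the usual causal mask that stops each time step from attending to later ones, and it uses the textbook definitions of mean squared error and mean absolute percentage error. All function names are hypothetical and chosen for illustration.

```python
import math


def square_subsequent_mask(size):
    """Build a size x size additive mask (assumed causal form).

    Entry [i][j] is 0.0 when step i may attend to step j (j <= i)
    and -inf when step j lies in the future (j > i).
    """
    return [[0.0 if j <= i else -math.inf for j in range(size)]
            for i in range(size)]


def masked_softmax(scores, mask):
    """Row-wise softmax of scores + mask; masked entries get weight 0."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        shifted = [s + m for s, m in zip(score_row, mask_row)]
        peak = max(shifted)  # finite: the diagonal is never masked
        exps = [math.exp(v - peak) for v in shifted]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def mse(actual, predicted):
    """Mean squared error over paired observations."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)
```

With uniform attention scores, the masked softmax spreads weight evenly over the current and earlier steps only, so the first row attends solely to itself. Both metrics are errors, so lower values indicate better forecasts.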
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2022.117275