Video Transformers: A Survey

Transformer models have shown great success handling long-range interactions, making them a promising tool for modeling video. However, they lack inductive biases and scale quadratically with input length. These limitations are further exacerbated when dealing with the high dimensionality introduced...

Bibliographic Details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-11, Vol. 45 (11), pp. 12922-12943
Main Authors:	Selva, Javier; Johansen, Anders S.; Escalera, Sergio; Nasrollahi, Kamal; Moeslund, Thomas B.; Clapes, Albert
Format:	Article
Language:	English
Subjects:	Artificial intelligence; Bias; computer vision; Current transformers; Data models; Redundancy; self-attention; Self-supervised learning; Task analysis; Tokenization; Training; Transformers; video representations; Visualization