Leveraging Batch Normalization for Vision Transformers

Transformer-based vision architectures have attracted great attention because of their strong performance compared with convolutional neural networks (CNNs). Inherited from NLP tasks, these architectures use Layer Normalization (LN) as the default normalization technique. On the other hand, previous visio...

Bibliographic Details
Main Authors:	Yao, Zhuliang; Cao, Yue; Lin, Yutong; Liu, Ze; Zhang, Zheng; Hu, Han
Format:	Conference Proceeding
Language:	English
Subjects:	Computer architecture; Computer crashes; Computer vision; Conferences; Feeds; Training; Transformers
Online Access:	Request full text
