Leveraging Batch Normalization for Vision Transformers

Transformer-based vision architectures have attracted great attention because of their strong performance compared with convolutional neural networks (CNNs). Inherited from NLP tasks, these architectures use Layer Normalization (LN) as the default normalization technique. On the other hand, previous visio...

Bibliographic Details
Main Authors: Yao, Zhuliang; Cao, Yue; Lin, Yutong; Liu, Ze; Zhang, Zheng; Hu, Han
Format: Conference Proceeding
Language: English