Loading…

Scale-Aware Automatic Augmentations for Object Detection With Dynamic Training

Data augmentation is a critical technique in object detection, especially the augmentations targeting at scale invariance training (scale-aware augmentation). However, there has been little systematic investigation of how to design scale-aware data augmentation for object detection. We propose Scale...

Full description

Saved in:

Bibliographic Details
Published in:	IEEE transactions on pattern analysis and machine intelligence 2023-02, Vol.45 (2), p.2367-2383
Main Authors:	Chen, Yukang, Zhang, Peizhen, Kong, Tao, Li, Yanwei, Zhang, Xiangyu, Qi, Lu, Sun, Jian, Jia, Jiaya
Format:	Article
Language:	English
Subjects:	Adaptation models automatic augmentation Data augmentation Detectors dynamic training Image color analysis Image segmentation Object detection Object recognition Optimization Policies Scale invariance Scale-aware Searching Task analysis Training
Citations:	Items that this one cites Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Data augmentation is a critical technique in object detection, especially the augmentations targeting at scale invariance training (scale-aware augmentation). However, there has been little systematic investigation of how to design scale-aware data augmentation for object detection. We propose Scale-aware AutoAug to learn data augmentation policies for object detection. We define a new scale-aware search space, where both image- and instance-level augmentations are designed for maintaining scale robust feature learning. Upon this search space, we propose a new search metric, termed Pareto Scale Balance, to facilitate efficient augmentation policy search. In experiments, Scale-aware AutoAug yields significant and consistent improvement on various object detectors (e.g., RetinaNet, Faster R-CNN, Mask R-CNN, and FCOS), even compared with strong multi-scale training baselines. Our searched augmentation policies are generalized well to other datasets and instance-level tasks beyond object detection, e.g., instance segmentation. The search cost is much less than previous automated augmentation approaches for object detection, i.e., 8 GPUs across 2.5 days versus. 800 TPU-days. In addition, meaningful patterns can be summarized from our searched policies, which intuitively provide valuable knowledge for hand-crafted data augmentation design. Based on the searched scale-aware augmentation policies, we further introduce a dynamic training paradigm to adaptively determine specific augmentation policy usage during training. The dynamic paradigm consists of an heuristic manner for image-level augmentations and a differentiable copy-paste-based method for instance-level augmentations. The dynamic paradigm achieves further performance improvements to Scale-aware AutoAug without any additional burden on the long tailed LVIS benchmarks. We also demonstrate its ability to prevent over-fitting for large models, e.g., the Swin Transformer large model. Code and models are available at https://github.com/dvlab-research/SA-AutoAug .
ISSN:	0162-8828 1939-3539 2160-9292
DOI:	10.1109/TPAMI.2022.3166905