
Scale-Aware Automatic Augmentations for Object Detection With Dynamic Training


Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-02, Vol. 45 (2), p. 2367-2383
Main Authors: Chen, Yukang; Zhang, Peizhen; Kong, Tao; Li, Yanwei; Zhang, Xiangyu; Qi, Lu; Sun, Jian; Jia, Jiaya
Format: Article
Language: English
Description
Summary: Data augmentation is a critical technique in object detection, especially augmentation that targets scale-invariant training (scale-aware augmentation). However, there has been little systematic investigation of how to design scale-aware data augmentation for object detection. We propose Scale-aware AutoAug to learn data augmentation policies for object detection. We define a new scale-aware search space, in which both image-level and instance-level augmentations are designed to maintain scale-robust feature learning. On top of this search space, we propose a new search metric, termed Pareto Scale Balance, to facilitate efficient augmentation policy search. In experiments, Scale-aware AutoAug yields significant and consistent improvements on various object detectors (e.g., RetinaNet, Faster R-CNN, Mask R-CNN, and FCOS), even compared with strong multi-scale training baselines. The searched augmentation policies generalize well to other datasets and to instance-level tasks beyond object detection, e.g., instance segmentation. The search cost is much lower than that of previous automated augmentation approaches for object detection: 8 GPUs across 2.5 days versus 800 TPU-days. In addition, meaningful patterns can be summarized from the searched policies, which provide intuitive guidance for hand-crafted data augmentation design. Based on the searched scale-aware augmentation policies, we further introduce a dynamic training paradigm that adaptively determines which augmentation policy to use during training. The dynamic paradigm consists of a heuristic method for image-level augmentations and a differentiable copy-paste-based method for instance-level augmentations. It achieves further performance improvements over Scale-aware AutoAug on the long-tailed LVIS benchmarks without any additional burden. We also demonstrate its ability to prevent over-fitting for large models, e.g., the Swin Transformer large model.
Code and models are available at https://github.com/dvlab-research/SA-AutoAug.
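The summary mentions image-level augmentations that change object scale, but this record does not describe the searched operations themselves. As a rough illustration only, the sketch below shows the bookkeeping a generic image-level zoom needs: when an image is zoomed about its center on a fixed-size canvas, every bounding box has to be rescaled and clipped the same way. The function name `zoom_boxes`, the `(x1, y1, x2, y2)` box layout, and the clipping rule are assumptions made for this example and are not taken from the paper or its repository.

```python
def zoom_boxes(boxes, image_size, ratio):
    """Rescale boxes for an image zoomed by `ratio` about its center.

    Hypothetical helper, not the paper's implementation.
    boxes:      list of (x1, y1, x2, y2) tuples in pixels
    image_size: (width, height) of the fixed output canvas
    ratio:      > 1 zooms in (objects grow), < 1 zooms out (objects shrink)
    """
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0
    zoomed = []
    for x1, y1, x2, y2 in boxes:
        # Scale each corner about the image center.
        nx1 = cx + (x1 - cx) * ratio
        ny1 = cy + (y1 - cy) * ratio
        nx2 = cx + (x2 - cx) * ratio
        ny2 = cy + (y2 - cy) * ratio
        # Clip to the canvas; this only has an effect when zooming in.
        nx1 = min(max(nx1, 0.0), float(width))
        ny1 = min(max(ny1, 0.0), float(height))
        nx2 = min(max(nx2, 0.0), float(width))
        ny2 = min(max(ny2, 0.0), float(height))
        # Drop boxes that were pushed entirely off the canvas.
        if nx2 > nx1 and ny2 > ny1:
            zoomed.append((nx1, ny1, nx2, ny2))
    return zoomed


# Zooming out by 0.5 halves the side length of a centered box.
small = zoom_boxes([(40, 40, 60, 60)], (100, 100), 0.5)
# Zooming in by 2.0 doubles it and removes a box near the corner.
large = zoom_boxes([(40, 40, 60, 60), (0, 0, 10, 10)], (100, 100), 2.0)
```

A learned policy in the spirit of the summary would choose whether and how strongly to apply such an operation; how Scale-aware AutoAug parameterizes and searches those choices is described in the full text, not here.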
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2022.3166905