Boosting Few-shot visual recognition via saliency-guided complementary attention
Published in: | Neurocomputing (Amsterdam) 2022-10, Vol.507, p.412-427 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • Boosting few-shot learning (FSL) via saliency guidance provided only in training. • A unified framework for improved feature learning and classification head building. • Generating more accurate spatial attention maps compared to saliency predictions. • SGCA outperforms other saliency-based FSL works by a significant margin. • Achieving SOTA performance on multiple few-shot recognition datasets and scenarios.
Despite significant recent progress in deep neural networks, most deep learning algorithms rely heavily on abundant training samples. To address this issue, few-shot learning (FSL) methods are designed to learn models that can generalize to novel classes with limited training data. In this work, we propose an effective and interpretable FSL approach termed Saliency-Guided Complementary Attention (SGCA). Concretely, SGCA aims to boost few-shot visual recognition from two perspectives within a unified framework: learning generalizable feature representations and building a robust classification module. For generalizable representation learning, we propose to explore the intrinsic structure of natural images by training the feature extractor with an auxiliary task: segmenting foreground regions from background clutter. The guidance signals are provided during training by a saliency detector, which highlights object regions in images in a manner consistent with the human visual system. Moreover, for robust classification module building, we introduce a complementary attention mechanism based on the learned segmentation, which makes the classification module focus on various informative parts of the image. Extensive experiments on five popular FSL datasets demonstrate that SGCA can outperform state-of-the-art approaches by a significant margin. In addition, extensions of SGCA to other challenging scenarios, including generalized, transductive and semi-supervised FSL, also verify the effectiveness and flexibility of our proposed approach. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2022.08.028 |
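
The summary above names two ingredients: an auxiliary foreground segmentation task supervised by saliency maps, and a complementary attention mechanism built on the learned segmentation. The record does not give the actual formulation, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that the complementary map is simply one minus the foreground map, that features are pooled by attention-weighted averaging, and that the auxiliary loss is a binary cross-entropy. All function names here are hypothetical.

```python
import math


def attention_pool(features, attention, eps=1e-8):
    """Attention-weighted average of per-location feature vectors.

    features:  list of N locations, each a list of C channel values
    attention: list of N weights in [0, 1]
    """
    total = sum(attention) + eps
    channels = len(features[0])
    return [
        sum(a * f[c] for a, f in zip(attention, features)) / total
        for c in range(channels)
    ]


def complementary_features(features, fg_attention):
    """Pool with the foreground map and with its complement (ASSUMED: 1 - a).

    Returns the two pooled vectors concatenated (length 2 * C).
    """
    bg_attention = [1.0 - a for a in fg_attention]
    return (attention_pool(features, fg_attention)
            + attention_pool(features, bg_attention))


def saliency_bce(predicted, target, eps=1e-8):
    """Mean binary cross-entropy between a predicted foreground mask and a
    saliency map (ASSUMED form of the auxiliary segmentation loss)."""
    return -sum(
        t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
        for p, t in zip(predicted, target)
    ) / len(predicted)


# Toy input: 4 spatial locations, 2 channels.
# The first two locations are "foreground", the last two "background".
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fg_attention = [1.0, 1.0, 0.0, 0.0]

pooled = complementary_features(features, fg_attention)
aux_loss = saliency_bce([0.5, 0.5], [1.0, 0.0])
```

In this toy case the foreground half of `pooled` picks up only the first channel and the complementary half only the second, which illustrates how two complementary maps can direct a classifier to different parts of the same image. In the setting the summary describes, the saliency maps would be needed only for the training-time loss; at test time the attention would come from the model's own predicted segmentation.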