
Learning attention-guided pyramidal features for few-shot fine-grained recognition

Bibliographic Details
Published in: Pattern Recognition, 2022-10, Vol. 130, p. 108792, Article 108792
Main Authors: Tang, Hao; Yuan, Chengcheng; Li, Zechao; Tang, Jinhui
Format: Article
Language: English
Description
Summary:
• We propose a two-stage meta-learning framework to learn attention-guided pyramidal features for few-shot fine-grained recognition.
• We utilize a multi-scale feature pyramid and a multi-level attention pyramid to extract diverse features from different granularities.
• An attention-guided refinement strategy is proposed to enhance the dominant object and eliminate the negative interference of backgrounds.
• Extensive experiments demonstrate that the proposed framework significantly improves the performance of few-shot fine-grained recognition.

Few-shot fine-grained recognition (FS-FGR) aims to distinguish several highly similar objects from different sub-categories with limited supervision. However, traditional few-shot learning solutions typically exploit image-level features and focus on capturing global silhouettes while neglecting local details, which inevitably loses inconspicuous but distinguishing information. Thus, how to effectively address fine-grained recognition given limited samples remains a major challenge. In this article, we propose an effective bidirectional pyramid architecture that enhances the internal representations of features for the fine-grained image recognition task in the few-shot learning scenario. Specifically, we deploy a multi-scale feature pyramid and a multi-level attention pyramid on the backbone network and progressively aggregate features from different granular spaces via both of them. We then present an attention-guided refinement strategy, in collaboration with the multi-level attention pyramid, to reduce the uncertainty introduced by backgrounds when samples are limited. In addition, the proposed method is trained with the meta-learning framework in an end-to-end fashion without any extra supervision. Extensive experimental results on four challenging and widely used fine-grained benchmarks show that the proposed method performs favorably against state-of-the-art methods, especially in the one-shot scenarios.
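
The summary mentions an attention-guided refinement step that emphasizes the main object and suppresses background. The record does not give the article's actual formulation, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a single H x W x C feature map, a channel-mean spatial attention, and a fixed threshold; the names `spatial_attention`, `refine`, and `threshold` are hypothetical.

```python
from typing import List

# Assumed layout: features[h][w] is a list of C channel activations.
FeatureMap = List[List[List[float]]]


def spatial_attention(features: FeatureMap) -> List[List[float]]:
    """Channel-mean activation per location, min-max scaled to [0, 1]."""
    raw = [[sum(cell) / len(cell) for cell in row] for row in features]
    flat = [v for row in raw for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # Flat map: no location stands out, so keep everything.
        return [[1.0 for _ in row] for row in raw]
    return [[(v - lo) / span for v in row] for row in raw]


def refine(features: FeatureMap, threshold: float = 0.5) -> FeatureMap:
    """Scale each location by its attention; zero out locations below threshold.

    The threshold is an illustrative assumption, not a value from the article.
    """
    attn = spatial_attention(features)
    return [
        [
            [c * a if a >= threshold else 0.0 for c in cell]
            for cell, a in zip(row, attn_row)
        ]
        for row, attn_row in zip(features, attn)
    ]


# Toy 2 x 2 map with 2 channels: the right column plays the "object".
features = [
    [[0.1, 0.1], [0.9, 1.1]],
    [[0.2, 0.0], [0.5, 0.7]],
]
refined = refine(features)
```

In this toy input the two weakly activated left-hand locations are zeroed, while the strongly activated locations are kept and scaled by their attention weight. A learned, multi-level attention such as the one the summary describes would replace the channel-mean heuristic used here.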
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2022.108792