GBP: Graph convolutional network embedded in bilinear pooling for fine-grained encoding
Published in: | Computers & electrical engineering 2024-05, Vol.116, p.109158, Article 109158 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Summary: | In fine-grained recognition, classical high-order coding suffers from an inherent contradiction between visual burstiness and feature redundancy, which stems from the instability of high-order features. Existing methods mainly rely on eigendecomposition (EIG) or singular value decomposition (SVD) to stabilize the features, but this process increases feature redundancy. To address this problem, the paper proposes a Graph Bilinear Pooling (GBP) model that obtains stable fine-grained features through the aggregation ability of graph networks. GBP avoids explicit feature decomposition and reconciles the contradiction between visual burstiness and feature redundancy. First, GBP transforms images into a graph spectrum through feature correlation measurement. Then, an improved multi-head graph convolution structure based on Graph Isomorphism Networks (GIN) performs feature aggregation. Finally, bilinear pooling is applied between the graph convolution feature maps and the original feature maps to obtain more compact and stable fine-grained feature representations. Experiments on the CUB, Cars, and Aircrafts datasets show accuracies of 87.8%, 93.5%, and 89.6%, respectively, with a 2048-dimensional feature representation. Compared with the baseline model, GBP uses only 25% as many features while improving accuracy by 2.6%, 1.7%, and 1.3%, respectively. These results demonstrate the effectiveness of graph neural network embedding in improving feature stability. |
---|---|
ISSN: | 0045-7906 1879-0755 |
DOI: | 10.1016/j.compeleceng.2024.109158 |
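
The three stages named in the Summary (graph construction from feature correlation, GIN-style aggregation, and bilinear pooling against the original features) can be illustrated with a small dependency-free sketch. This is not the authors' implementation: the cosine-similarity adjacency, the single-head one-layer GIN update, the signed square root with L2 normalisation, and all names and sizes below are assumptions chosen for illustration, and the paper's improved multi-head structure is not reproduced.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def correlation_graph(x):
    """Adjacency from cosine similarity between node features (assumed measure)."""
    norms = [math.sqrt(sum(v * v for v in row)) or 1.0 for row in x]
    unit = [[v / n for v in row] for row, n in zip(x, norms)]
    sim = matmul(unit, [list(c) for c in zip(*unit)])
    n = len(x)
    # Keep non-negative similarities and drop self-loops; the GIN update
    # adds the node's own feature separately.
    return [[max(sim[i][j], 0.0) if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def gin_layer(x, adj, weight, eps=0.0):
    """GIN-style update: ReLU(((1 + eps) * h_i + sum_j a_ij * h_j) W)."""
    neigh = matmul(adj, x)
    mixed = [[(1.0 + eps) * xv + nv for xv, nv in zip(xr, nr)]
             for xr, nr in zip(x, neigh)]
    return [[max(v, 0.0) for v in row] for row in matmul(mixed, weight)]


def bilinear_pool(g, x):
    """Average the outer products g_i x_i^T over nodes, then normalise."""
    n = len(x)
    g_t = [list(c) for c in zip(*g)]
    pooled = [v / n for row in matmul(g_t, x) for v in row]
    # Signed square root followed by L2 normalisation, a common
    # post-processing step for bilinear features (assumed here).
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in signed)) or 1.0
    return [v / norm for v in signed]


random.seed(0)
num_nodes, channels, graph_dim = 6, 8, 4  # toy sizes, not the paper's
# Stand-in for a flattened CNN feature map: one row per spatial location.
features = [[random.gauss(0, 1) for _ in range(channels)]
            for _ in range(num_nodes)]
# Hypothetical projection weights for the one-layer GIN "MLP".
weight = [[random.gauss(0, 0.5) for _ in range(graph_dim)]
          for _ in range(channels)]

adjacency = correlation_graph(features)
graph_features = gin_layer(features, adjacency, weight)
descriptor = bilinear_pool(graph_features, features)
print(len(descriptor))  # graph_dim * channels
```

In this sketch, projecting the graph branch to a smaller width (`graph_dim`) shrinks the pooled descriptor relative to pooling the original features with themselves (`channels * channels`); whether GBP reaches its 2048-dimensional representation in this way is not stated in the record.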