Loading…
Attribute hierarchy based multi-task learning for fine-grained image classification
Fine-grained image classification aims to distinguish subcategories belonging to the same basic-level category, such as 200 subcategories belonging to bird. It is a challenging problem in computer vision and multimedia field due to: attribute similarity (e.g. color and texture) among different subca...
Saved in:
Published in: | Neurocomputing (Amsterdam) 2020-06, Vol.395, p.150-159 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Fine-grained image classification aims to distinguish subcategories belonging to the same basic-level category, such as 200 subcategories belonging to bird. It is a challenging problem in computer vision and multimedia field due to: attribute similarity (e.g. color and texture) among different subcategories and attribute variance (e.g. pose and viewpoint) in the same subcategory. Attribute similarity causes the difficulty to classify different subcategories even for human, while attribute variance causes the learned feature representations chaotic and confused. Naturally, classification can benefit from a hierarchy of subcategories: since going to a coarser granularity leverages high-level semantic features, while going to a finer granularity leverages discriminative and subtle features. Therefore, we propose an attribute hierarchy based multi-task learning (AHMTL) approach, and its main novelties are: (1) Attribute hierarchy: We reassign all images to multi-granularity subcategories automatically, namely coarse-grained, fine-grained and ultra-fine-grained subcategories. Similar fine-grained subcategories are reassigned to the same coarse-grained subcategory according to their attribute similarity, which pays more attention to the high-level semantic feature representations. Simultaneously, the same fine-grained subcategory but with different attributes are divided into different ultra-fine-grained subcategories, which can obtain more discriminative and subtle feature representations for attribute variance. (2) Multi-task learning: A multi-task learning framework is designed to effectively learn robust feature representations by jointly optimizing coarse-grained, fine-grained and ultra-fine-grained image classification tasks. These three level tasks learn coarse to fine feature representations, meaning high-level semantic to subtle features, which can regularize and boost each other to prevent overfitting, and the mutual promotion of them ensures feature representations more discriminative. Compared with more than 10 state-of-the-art methods on two widely-used CUB-200-2011 and Cars-196 datasets, our AHMTL approach achieves the best classification performance. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2018.02.109 |