Loading…

Learning fine-grained features via a CNN Tree for Large-scale Classification

•We propose to build a tree structure that could progressively learn fine-grained features to distinguish a subset of confusing classes, by learning features only among these classes. This enables our method be capable of recovering test examples misclassified by the basic model.•To learn the tree s...

Full description

Saved in:
Bibliographic Details
Published in:Neurocomputing (Amsterdam) 2018-01, Vol.275, p.1231-1240
Main Authors: Wang, Zhenhua, Wang, Xingxing, Wang, Gang
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:•We propose to build a tree structure that could progressively learn fine-grained features to distinguish a subset of confusing classes, by learning features only among these classes. This enables our method be capable of recovering test examples misclassified by the basic model.•To learn the tree structure, we propose a new learning algorithm to grow the tree in a top-down breadth-first manner.•The key of this learning algorithm is to efficient discover confusion set of each class and merge classes which share common categories in their confusion sets.•We formulate the merging problem as a virtual machine packing problem with general sharing model, and propose a heuristic method to efficiently solve this problem. We propose a novel approach to enhance the discriminability of Convolutional Neural Networks (CNN). The key idea is to build a tree structure that could progressively learn fine-grained features to distinguish a subset of classes, by learning features only among these classes. Such features are expected to be more discriminative, compared to features learned for all the classes. We develop a new algorithm to effectively learn the tree structure from a large number of classes. Experiments on large-scale image classification tasks demonstrate that our method could boost the performance of a given basic CNN model. Our method is quite general, hence it can potentially be used in combination with many other deep learning models. [Display omitted]
ISSN:0925-2312
1872-8286
DOI:10.1016/j.neucom.2017.09.061