Loading…

Semantic-Aware Triplet Loss for Image Classification

Successful image classification requires a discriminative representation learning model for images. To approach this idea, deep metric learning (DML), serving as building a basic feature space with a pre-defined metric, has demonstrated compelling performance over the years. DML is often implemented...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on multimedia 2023, Vol.25, p.4563-4572
Main Authors: Wang, Guangzhi, Guo, Yangyang, Xu, Ziwei, Wong, Yongkang, Kankanhalli, Mohan S.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Successful image classification requires a discriminative representation learning model for images. To approach this idea, deep metric learning (DML), serving as building a basic feature space with a pre-defined metric, has demonstrated compelling performance over the years. DML is often implemented with a carefully crafted loss function, such as the representative triplet loss, which encourages a positive sample to be by a fixed margin closer to the anchor than the negative. Despite its efficacy, the negative samples are treated uniformly, rendering the feature space less informative since different negative samples can be largely different from the anchor. In this work, we, for the first time, propose to exploit the semantic information inherent in discrete class labels as an aid for the triplet loss. Specifically, we build a bi-level negative sampling strategy, i . e ., strong negative and weak negative sampling, with the guidance of an external knowledge source, from which rich class semantics can be extracted. With several fine-grained and complementary triplet losses based on this strategy, our method is enhanced with semantic awareness for image classification. In addition, to coordinate with the complicated training dynamics, we devise an ad-hoc Semantic Relation Weighting module, which consistently inspects model states and dynamically adjusts the importance of each triplet loss. It is worth noting that our method is plug-and-play, and we thus test its validity over various backbones and knowledge sources. Both qualitative and quantitative experimental results on benchmark datasets demonstrate the effectiveness of employing semantics for image classification.
ISSN:1520-9210
1941-0077
DOI:10.1109/TMM.2022.3177929