Region-Awared Transformer with Asymmetric Loss in Multi-Label Classification

Bibliographic Details
Main Authors: Zhang, Lei, Liu, Jie, Bao, Yanqi, Wang, Jie
Format: Conference Proceeding
Language:English
Description
Summary:Multi-label image classification (MLIC) deals with assigning multiple labels to each image, an easy task for humans yet still an open problem in machine learning. The greatest challenge in MLIC is that different target objects in one image appear at distinct viewpoints and scales. One effective approach is to borrow label-related information to guide the selection of regions of interest, which play an important role in classification. By leveraging the attention mechanism in the transformer, we propose a region-awared transformer that focuses on the most relevant regions and neglects background interference. Furthermore, our approach copes with the positive-negative imbalance by assigning separate exponential decay factors to positive and negative samples. Experiments on MS-COCO show performance competitive with other state-of-the-art methods.
ISSN:2379-190X
DOI:10.1109/ICASSP49357.2023.10095686
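
Illustrative note: the summary mentions handling positive-negative imbalance with separate exponential decay factors for positive and negative samples. The sketch below shows one common way such an asymmetric, focal-style loss can be written. It is an assumption for illustration, not the authors' exact formulation; the function name and the parameters gamma_pos, gamma_neg, and eps are hypothetical.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0, eps=1e-8):
    """Mean asymmetric focal-style loss over all label slots (illustrative).

    probs     : predicted probabilities in [0, 1], one per label
    targets   : 0/1 ground-truth labels, same length as probs
    gamma_pos : assumed decay exponent applied to positive labels
    gamma_neg : assumed decay exponent applied to negative labels
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        if y == 1:
            # Positive label: confident hits (p near 1) are down-weighted.
            total += -((1.0 - p) ** gamma_pos) * math.log(p)
        else:
            # Negative label: easy negatives (p near 0) are down-weighted,
            # more strongly as gamma_neg grows.
            total += -(p ** gamma_neg) * math.log(1.0 - p)
    return total / len(probs)
```

With both exponents set to 0 this reduces to ordinary binary cross-entropy. Choosing a larger exponent for negatives than for positives shrinks the contribution of the many easy negative labels, which is the imbalance the summary refers to.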