Region-Awared Transformer with Asymmetric Loss in Multi-Label Classification
Main Authors: | , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Multi-label image classification (MLIC) assigns multiple labels to each image, an easy task for humans but still an open problem in machine learning. The greatest challenge in MLIC is that the target objects in a single image appear at different viewpoints and scales. One effective approach is to use label-related information to guide the selection of regions of interest, which play an important role in classification. By leveraging the attention mechanism of the transformer, we propose a region-awared transformer that focuses on the most relevant regions and suppresses background interference. Furthermore, our approach copes with the positive-negative imbalance by assigning separate exponential decay factors to positive and negative samples. Experiments on MS-COCO show performance competitive with other state-of-the-art methods. |
---|---|
ISSN: | 2379-190X |
DOI: | 10.1109/ICASSP49357.2023.10095686 |
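
The summary's last technical point, separate exponential decay factors for positive and negative samples, can be illustrated with a minimal sketch. It assumes the common asymmetric focal-style formulation, in which each label's cross-entropy term is scaled by a power of the model's error on that label. This record does not give the paper's exact loss, so the function name `asymmetric_loss`, the parameters `gamma_pos` and `gamma_neg`, and their default values are illustrative assumptions, not the authors' implementation.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0, eps=1e-8):
    """Mean asymmetric focal-style loss over the labels of one image.

    probs   -- predicted probability for each label, each in [0, 1]
    targets -- ground-truth 0/1 indicator for each label
    gamma_pos, gamma_neg -- hypothetical decay exponents for positive and
        negative labels; the defaults are placeholders, not values taken
        from the catalogued paper.
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    if not probs:
        raise ValueError("at least one label is required")

    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            # Positive label: down-weight by (1 - p) ** gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(p)
        else:
            # Negative label: down-weight by p ** gamma_neg, so confidently
            # rejected negatives contribute almost nothing.
            total += -(p ** gamma_neg) * math.log(1.0 - p)
    return total / len(probs)


if __name__ == "__main__":
    # One positive label and three easy negatives.
    probs = [0.7, 0.1, 0.05, 0.2]
    targets = [1, 0, 0, 0]
    print("plain BCE      :", asymmetric_loss(probs, targets, 0.0, 0.0))
    print("asymmetric loss:", asymmetric_loss(probs, targets, 0.0, 4.0))
```

With both exponents set to zero the function reduces to ordinary binary cross-entropy. Raising `gamma_neg` above `gamma_pos` shrinks the contribution of the many easy negative labels, which is one way to address the positive-negative imbalance the summary mentions.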