
Sketch Classification and Sketch Based Image Retrieval Using ViT with Self-Distillation for Few Samples


Bibliographic Details
Published in: Journal of electrical engineering & technology 2024-09, Vol.19 (7), p.4587-4593
Main Authors: Kang, Sungjae, Seo, Kisung
Format: Article
Language: English
Description
Summary: Zero-shot sketch-based image retrieval (SBIR) is a challenging task in computer vision: it retrieves photo images relevant to sketch queries that were not seen in the training phase. For sketch images without sequence information, we propose a modified Vision Transformer (ViT)-based approach that maintains or improves performance while reducing the amount of sketch training data. First, we add a retrieval token and integrate multi-branch auxiliary classifiers into the ViT network. Second, we apply self-distillation, adding classifiers and embedding vectors to each intermediate layer of the network, to enable fast transfer learning to the sketch domain. Third, to address the overfitting caused by the reduced number of input data pairs when training with large datasets, we integrate KL divergence, which captures distribution differences between sketches and photos, into the triplet loss, thereby mitigating the impact of limited sketch-photo samples. Experiments on the TU-Berlin and Sketchy datasets show that our method achieves a significant improvement over similar methods on sketch classification and sketch-based image retrieval.
ISSN: 1975-0102, 2093-7423
DOI: 10.1007/s42835-024-01889-6
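
Illustrative note: the summary mentions integrating KL divergence into the triplet loss but does not give the formula. The sketch below shows one plausible reading, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the softmax over embeddings to obtain distributions, the KL direction (sketch relative to photo), the additive combination, and the `margin` and `kl_weight` values.

```python
import math

def softmax(values):
    """Turn an embedding vector into a probability distribution (assumed step)."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_kl_loss(sketch, pos_photo, neg_photo, margin=1.0, kl_weight=0.1):
    """Standard triplet margin loss plus a weighted KL term (hypothetical combination).

    sketch:    embedding of the sketch anchor
    pos_photo: embedding of a photo from the same category
    neg_photo: embedding of a photo from a different category
    """
    triplet = max(0.0, euclidean(sketch, pos_photo)
                  - euclidean(sketch, neg_photo) + margin)
    # Assumed: penalize the distribution gap between sketch and matching photo.
    kl = kl_divergence(softmax(sketch), softmax(pos_photo))
    return triplet + kl_weight * kl
```

In this reading, a triplet whose sketch and matching photo embeddings coincide and whose negative photo is farther than the margin contributes zero loss, while a mismatched pair is penalized by both terms.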