Cross-modal transfer with neural word vectors for image feature learning
Main Authors:
Format: Conference Proceeding
Language: English
Subjects:
Summary: Neural word vectors (NWVs), such as word2vec, are a powerful text representation tool that can encode extensive semantic information into compact vectors. This ability raises an interesting question for image processing research: can we learn better semantic image features from NWVs? We explore this question empirically in the context of semantic content-based image retrieval (CBIR). In this paper, we consider cross-modal transfer learning (CMT) to improve initial convolutional neural network (CNN) image features using NWVs. We first show that NWVs improve semantic CBIR performance over classical word vectors, even with a simple CMT model, namely canonical correlation analysis (CCA). Next, inspired by a characteristic property of NWVs, we propose a new CMT model and demonstrate that it improves CBIR performance even further.
ISSN: 2379-190X
DOI: 10.1109/ICASSP.2017.7952690