MMDF-LDA: An improved Multi-Modal Latent Dirichlet Allocation model for social image annotation
Published in: | Expert systems with applications 2018-08, Vol.104, p.168-184 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • A multi-modal data fusion model for social image annotation is proposed. • A probabilistic topic model is learned by fusing multi-modal metadata. • Geographical topics are generated from the geographical regions of social images. • Patches of social images are annotated by the proposed model. • Experiments demonstrate the effectiveness of the proposed solution.
Social image annotation, which aims at inferring a set of semantic concepts for a social image, is an effective and straightforward way to facilitate social image search. Conventional approaches mainly rely on visual features and tags, without considering other types of metadata. How to improve the accuracy of social image annotation by fully exploiting multi-modal features remains an open and challenging problem. In this paper, we propose an improved Multi-Modal Data Fusion based Latent Dirichlet Allocation (LDA) topic model (MMDF-LDA) that annotates social images by fusing visual content, user-supplied tags, user comments, and geographic information. When MMDF-LDA samples annotations for one data modality, all the other data modalities are exploited. In MMDF-LDA, geographical topics are generated from the GPS locations of social images, and annotations have different probabilities of being used in different geographical regions. A social image is divided into several patches in advance, and MMDF-LDA then assigns annotations to the patches by estimating the probability of each annotation-patch assignment. Through experiments in social image annotation and retrieval on several datasets, we demonstrate the effectiveness of the proposed MMDF-LDA model in comparison with state-of-the-art methods. |
---|---|
ISSN: | 0957-4174, 1873-6793 |
DOI: | 10.1016/j.eswa.2018.03.014 |
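
The summary says that annotations are assigned to image patches by estimating an annotation-patch probability, and that annotations are more or less likely depending on the geographical region. The sketch below illustrates that idea only in a simplified form. It is not the MMDF-LDA inference procedure, which the record does not describe. The function name `annotate_patch`, the scoring rule (a region preference multiplied by a topic mixture), and all numbers are assumptions made for illustration.

```python
def annotate_patch(patch_topic_probs, topic_annotation_probs,
                   region_annotation_probs, top_k=2):
    """Rank candidate annotations for a single image patch.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score(w) = P(w | region) * sum_k P(k | patch) * P(w | k)
    Scores are normalised so that they sum to 1 over all candidates.
    """
    scores = {}
    for annotation, region_p in region_annotation_probs.items():
        # Mix the per-topic annotation probabilities by the patch's topic weights.
        topic_p = sum(
            patch_topic_probs[k] * topic_annotation_probs[k].get(annotation, 0.0)
            for k in range(len(patch_topic_probs))
        )
        scores[annotation] = region_p * topic_p

    total = sum(scores.values())
    if total == 0:
        return []  # no candidate annotation has any support

    ranked = sorted(
        ((a, s / total) for a, s in scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_k]


# Toy, made-up inputs: two topics, three candidate annotations.
patch_topic_probs = [0.8, 0.2]                      # P(topic | patch)
topic_annotation_probs = [                          # P(annotation | topic)
    {"beach": 0.6, "sea": 0.3, "snow": 0.1},
    {"beach": 0.1, "sea": 0.1, "snow": 0.8},
]
region_annotation_probs = {"beach": 0.5, "sea": 0.4, "snow": 0.1}  # P(annotation | region)

result = annotate_patch(patch_topic_probs, topic_annotation_probs,
                        region_annotation_probs)
print(result)
```

With these toy inputs the coastal-region preference and the dominant first topic both favour "beach", so it ranks first and "sea" second. A real model would learn all three distributions from data rather than take them as fixed inputs.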