HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language

Bibliographic Details
Published in: arXiv.org, 2023-05
Main Authors: Shantipriya Parida, Idris Abdulmumin, Shamsuddeen Hassan Muhammad, Aneesh Bose, Guneet Singh Kohli, Ibrahim Said Ahmad, Ketan Kotwal, Sayan Deb Sarkar, Ondřej Bojar, Habeebah Adamu Kakudi
Format: Article
Language: English
Description
Summary: This paper presents HaVQA, the first multimodal dataset for visual question-answering (VQA) tasks in the Hausa language. The dataset was created by manually translating 6,022 English question-answer pairs associated with 1,555 unique images from the Visual Genome dataset. As a result, the dataset provides 12,044 gold-standard English-Hausa parallel sentences, translated in a fashion that guarantees their semantic match with the corresponding visual information. We conducted several baseline experiments on the dataset, covering visual question answering, visual question elicitation, and both text-only and multimodal machine translation; see the sketch after this record for how the pair and sentence counts relate.
ISSN: 2331-8422
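
The counts in the summary fit together as follows: each of the 6,022 question-answer pairs contributes two sentences (the question and the answer), each translated into Hausa, giving 6,022 × 2 = 12,044 English-Hausa parallel sentences. The Python sketch below illustrates this with a hypothetical record layout; the field names are assumptions for illustration only, not the official schema of the HaVQA release.

    # Hypothetical layout of one HaVQA record. Field names are illustrative
    # assumptions, not the official schema of the released dataset.
    record = {
        "image_id": 1,                             # identifier of the Visual Genome image
        "question_en": "What color is the bus?",   # original English question
        "question_ha": "...",                      # manual Hausa translation of the question
        "answer_en": "Green",                      # original English answer
        "answer_ha": "...",                        # manual Hausa translation of the answer
    }

    # Each QA pair yields two English-Hausa parallel sentences
    # (question + answer), which gives the sentence count in the summary.
    num_qa_pairs = 6022
    parallel_sentences = 2 * num_qa_pairs
    assert parallel_sentences == 12044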