Multi-View Visual Relationship Detection with Estimated Depth Map

Bibliographic Details
Published in: Applied Sciences, 2022-05, Vol. 12 (9), p. 4674
Main Authors: Liu, Xiaozhou; Gan, Ming-Gang; He, Yuxuan
Format: Article
Language: English
Description
Summary: The abundant visual information contained in multi-view images is widely used in computer vision tasks. Existing visual relationship detection frameworks have extended the feature vector to improve model performance. However, single-view information cannot fully reveal the visual relationships in complex visual scenes. To solve this problem and explore multi-view information in a visual relationship detection (VRD) model, a novel multi-view VRD framework based on a monocular RGB image and an estimated depth map is proposed. The contributions of this paper are threefold. First, we construct a novel multi-view framework that fuses information from different views extracted from estimated RGB-D images. Second, a multi-view image generation method is proposed to transfer the flat visual space to a 3D multi-view space. Third, we redesign the visual relationship balanced classifier so that it can process multi-view feature vectors simultaneously. Detailed experiments were conducted on two datasets to demonstrate the effectiveness of the multi-view VRD framework. The experimental results showed that the multi-view VRD framework achieved state-of-the-art zero-shot learning performance under specific depth conditions.
ISSN: 2076-3417
DOI: 10.3390/app12094674