Video object tracking and segmentation with box annotation

Bibliographic Details
Published in: Signal processing. Image communication, 2020-07, Vol. 85, p. 115858, Article 115858
Main Authors: Wang, Ye; Choi, Jongmoo; Zhang, Kaitai; Huang, Qin; Chen, Yueru; Lee, Ming-Sui; Kuo, C.-C. Jay
Format: Article
Language: English
Description
Summary: This paper presents a two-stage approach, track and then segment, to perform semi-supervised video object segmentation (VOS) with only bounding box annotations. The proposed reverse optimization for VOS (ROVOS), which leverages a fully convolutional Siamese network, performs tracking and segmentation in the tracker. The segmentation cues reversely optimize the location of the tracker, and the object segmentation masks are produced online by the two-branch system. The experimental results on DAVIS 2016 and DAVIS 2017 demonstrate significant improvements of the proposed algorithm over state-of-the-art methods.
• A two-stage framework with the box annotation reduces the runtime by a large margin.
• The box annotation automatically segments the object in the remaining frames.
• The segmentation cues are used to perform reverse optimization to locate the objects.
• The framework outperforms state-of-the-art methods in terms of mean IoU scores.
ISSN: 0923-5965, 1879-2677
DOI: 10.1016/j.image.2020.115858