G2LP-Net: Global to Local Progressive Video Inpainting Network
Published in: | IEEE Transactions on Circuits and Systems for Video Technology, 2023-03, Vol. 33 (3), p. 1082-1092 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Summary: | Self-attention-based video inpainting methods have achieved promising progress by establishing long-range correlations over the whole video. However, existing methods generally rely on global self-attention, which directly searches for missing contents among all reference frames but lacks accurate matching and effective organization of contents, often blurring the result owing to the loss of local textures. In this paper, we propose a Global-to-Local Progressive Inpainting Network (G2LP-Net) built on the following ideas. First, we present a global-to-local self-attention mechanism that incorporates local self-attention into global self-attention to improve search efficiency and accuracy, where the self-attention is applied in multi-scale regions to fully exploit local redundancy for texture recovery. Second, we propose a progressive video inpainting (PVI) method that organizes the generated contents by completing the target video frames from periphery to core, so that reliable contents are used first. Third, we develop a window-sliding method for sampling reference frames to obtain rich available information for inpainting. In addition, we release a wire-removal video (WRV) dataset consisting of 150 video clips masked by wires to evaluate video inpainting on irregularly slender regions. Both quantitative and qualitative experiments on the benchmark datasets DAVIS and YouTube-VOS and on our WRV dataset demonstrate the superiority of the proposed G2LP-Net method. |
---|---|
ISSN: | 1051-8215, 1558-2205 |
DOI: | 10.1109/TCSVT.2022.3209548 |
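The summary mentions a window-sliding method for sampling reference frames, but the record does not give the paper's exact procedure. The sketch below is only an illustration of the general idea under assumed details (the function name and the `window`/`stride` parameters are hypothetical, not from the paper): combine near neighbors of the target frame with distant frames taken at a fixed stride, so both local and global reference information is available for inpainting.

```python
def sliding_window_refs(num_frames, target, window=2, stride=10):
    """Illustrative reference-frame sampler (assumed scheme, not the
    paper's exact method): return indices of local neighbors within
    +/- `window` of the target frame, plus distant frames sampled
    every `stride` frames across the video, excluding the target."""
    # Local neighborhood around the target frame.
    local = [i for i in range(target - window, target + window + 1)
             if 0 <= i < num_frames and i != target]
    # Distant frames at a fixed stride over the whole video.
    distant = [i for i in range(0, num_frames, stride)
               if i != target and i not in local]
    return sorted(set(local + distant))

# Example: a 50-frame video, inpainting frame 20.
refs = sliding_window_refs(50, 20, window=2, stride=10)
# → [0, 10, 18, 19, 21, 22, 30, 40]
```

The design choice being illustrated is that purely global sampling can miss fine local textures while purely local sampling can miss content revealed far away in time; mixing both pools gives the attention module richer candidates.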