Video Snapshot Compressive Imaging Using Residual Ensemble Network

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2022-09, Vol. 32 (9), p. 5931-5943
Main Authors: Sun, Yubao, Chen, Xunhao, Kankanhalli, Mohan S., Liu, Qingshan, Li, Junxia
Format: Article
Language: English
Description
Summary: A video snapshot compressive imaging (SCI) system enables high-frame-rate imaging by projecting multiple frames into a 2D snapshot measurement during a single exposure; the original video frames can then be reconstructed by solving an optimization problem. However, existing methods usually cannot achieve a good balance between reconstruction time and reconstruction quality, which has become a major obstacle to the practical application of video SCI. To address this issue, we propose a residual ensemble network that learns the explicit inverse mapping from the 2D snapshot measurement to the original video. Specifically, the proposed network exploits the spatiotemporal correlations between video frames to improve reconstruction quality. These correlations take multiple forms, including intra-frame spatial correlation and inter-frame forward and backward temporal correlation. To fully capture these different correlations, we design four sub-networks, namely a pseudo-3D U-shape sub-network, two residual sub-networks, and a serial forward and backward recurrent sub-network, and assemble them into an ensemble network through alternate residual links. This ensemble network can effectively fuse the predictions of the sub-networks and maintain spatiotemporal consistency between video frames. We further design a compound loss function to guide the network learning, and a new video can be reconstructed quickly by simply feeding its 2D snapshot measurement into the learned network. The experimental results demonstrate that our network can significantly improve the reconstruction quality while maintaining low computational cost.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2022.3164241
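
The summary describes the measurement step only in words: several frames are projected into one 2D snapshot during a single exposure. The sketch below illustrates that step on toy data. It assumes the masked-sum model commonly used in the video SCI literature (each frame is multiplied by its own binary mask and the results are summed); the record itself does not state this model, and the names `make_masks` and `snapshot` are hypothetical. It does not implement the paper's reconstruction network.

```python
import random


def make_masks(num_frames, height, width, seed=0):
    """Random binary masks, one per frame (an assumed modulation pattern)."""
    rng = random.Random(seed)
    return [[[rng.randint(0, 1) for _ in range(width)]
             for _ in range(height)]
            for _ in range(num_frames)]


def snapshot(frames, masks):
    """Collapse T frames into one 2D measurement: y = sum_t mask_t * x_t."""
    height, width = len(frames[0]), len(frames[0][0])
    y = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for i in range(height):
            for j in range(width):
                y[i][j] += mask[i][j] * frame[i][j]
    return y


# Toy video: 8 frames of 4x4 pixels; frame t has constant intensity t + 1.
T, H, W = 8, 4, 4
frames = [[[float(t + 1)] * W for _ in range(H)] for t in range(T)]
masks = make_masks(T, H, W)
y = snapshot(frames, masks)  # a single H x W measurement encoding all T frames
```

Reconstruction is the inverse of this step: recovering the T frames from `y` and the known masks, which the summary says the proposed network learns as an explicit mapping.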