The Interpretable Fast Multi-Scale Deep Decoder for the Standard HEVC Bitstreams

Bibliographic Details
Published in:	IEEE transactions on multimedia 2020-07, Vol.22 (7), p.1680-1691
Main Authors:	Xiao, Wenhui; He, Huiguo; Wang, Tingting; Chao, Hongyang
Format:	Article
Language:	English
Subjects:	Algorithms; Artificial neural networks; Coders; Coding; Coding standards; compression efficiency; Computational efficiency; Decoding; deep learning; Efficiency; Encoding; Frames (data processing); HEVC; interpretability; multi-scale similarity; Neural networks; Redundancy; Similarity; Video compression; Videos

Description
Summary:	Restoring decoded videos from existing bitstreams with deep neural networks, so as to improve compression efficiency at the decoder end, is a research hotspot. Existing research has verified that exploiting redundancy at the decoder end, which is underused by the encoder, can increase compression efficiency. However, most existing research neglects the abundant multi-scale information among video frames, which is a typical type of such redundancy. How to build an effective, interpretable and fast deep neural network that uses the multi-scale similarity at the decoder end to further enhance compression efficiency remains an interesting yet challenging topic. To this end, this paper considers the use of underused inter multi-scale information and proposes the Fast Multi-Scale Deep Decoder (Fast MSDD) for the state-of-the-art video coding standard HEVC. The advantages of Fast MSDD are three-fold. First, it achieves a higher coding efficiency without modifying any encoding algorithm. Second, Fast MSDD is interpretable based on the framework of using the underused redundancy. Third, it guarantees the model's inference speed while fully using the multi-scale similarity among video frames. Extensive experimental results verify Fast MSDD's effectiveness, interpretability, and computational efficiency. Fast MSDD obtains average BD gains of 14.3%, 10.8%, 8.5% and 7.6% for AI, LP, LB and RA, respectively. Compared with our previous work MSDD, Fast MSDD achieves increases of 59.3%, 49.1%, 61.0% and 29.3%. Meanwhile, BD gains of 16.9%, 11.2%, 9.2% and 8.3% are observed on videos with scale changes, which validates the interpretability of the proposed method. Furthermore, Fast MSDD saves up to 56.3% of the time compared with MSDD.
ISSN:	1520-9210; 1941-0077
DOI:	10.1109/TMM.2020.2978664