Convolutional Neural Networks Based Texture Modeling For AV1
Published in: | arXiv.org 2019-08 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Modern video codecs, including the newly developed AOMedia Video 1 (AV1), use hybrid coding techniques to remove spatial and temporal redundancy. However, efficiently exploiting statistical dependencies as measured by mean squared error (MSE) does not always produce the best psychovisual result. One interesting approach is to encode only visually relevant information and use a different coding method for "perceptually insignificant" regions in the frame, which can lead to substantial data rate reductions while maintaining visual quality. In this paper, we introduce a texture analyzer that runs before encoding and uses convolutional neural networks to identify "perceptually insignificant" regions in each frame of the input sequence. We designed and developed a new scheme that integrates the texture analyzer into the codec and largely reduces temporal flickering artifacts for codecs with a hierarchical coding structure. The proposed method is implemented in the AV1 codec by introducing a new coding tool called texture mode, a special inter mode handled at the encoder: when texture mode is selected, no inter prediction is performed for the identified texture regions. Instead, the displacement of the entire region is modeled by just one set of motion parameters. Therefore, only the model parameters are transmitted to the decoder for reconstructing the texture regions. Non-texture regions in the frame are coded conventionally. We show that for many standard test sets, the proposed method achieves significant data rate reductions with satisfactory visual quality. |
---|---|
ISSN: | 2331-8422 |
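
The summary's central idea, reconstructing a whole texture region from a single set of motion parameters instead of per-block inter prediction, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's implementation: the six-parameter affine form of the model, the nearest-neighbor sampling, the border clamping, and the names `reconstruct_texture`, `Affine`, `reference`, `decoded`, and `mask` are all hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]   # grayscale frame as rows of pixel values
Mask = List[List[bool]]   # True where the analyzer flagged texture

# Hypothetical affine model (a, b, tx, c, d, ty): a pixel (x, y) in the
# current frame is fetched from (a*x + b*y + tx, c*x + d*y + ty) in the
# reference frame. The abstract only says "one set of motion parameters";
# the affine form is an assumption.
Affine = Tuple[float, float, float, float, float, float]


def reconstruct_texture(reference: Frame, decoded: Frame,
                        mask: Mask, model: Affine) -> Frame:
    """Fill masked (texture) pixels by warping the reference frame.

    Non-texture pixels are copied unchanged from `decoded`, standing in
    for the conventionally coded part of the frame. Both frames are
    assumed to have the same dimensions.
    """
    height, width = len(decoded), len(decoded[0])
    a, b, tx, c, d, ty = model
    out = [row[:] for row in decoded]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue  # conventionally coded pixel, keep as-is
            # Nearest-neighbor sample, clamped to the frame borders.
            src_x = min(max(int(round(a * x + b * y + tx)), 0), width - 1)
            src_y = min(max(int(round(c * x + d * y + ty)), 0), height - 1)
            out[y][x] = reference[src_y][src_x]
    return out
```

With a pure translation such as `(1, 0, 1, 0, 1, 0)`, every texture pixel is copied from one column to its right in the reference frame, so the whole region is described by those six numbers rather than by per-block motion vectors and residuals.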