A ViT‐Based Adaptive Recurrent Mobilenet With Attention Network for Video Compression and Bit‐Rate Reduction Using Improved Heuristic Approach Under Versatile Video Coding
Published in: | Computational intelligence 2024-12, Vol.40 (6), p.n/a |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | ABSTRACT
Video compression has received considerable attention from the video processing and deep learning communities. Modern learning‐aided mechanisms use a hybrid coding approach to reduce spatial and temporal redundancy in pixel space and to improve motion compensation accuracy. Video compression research has improved substantially in recent years. Versatile Video Coding (VVC), also referred to as H.266, is the leading recent standard for video compression. The VVC codec is a block‐based hybrid codec, which makes it highly capable but also complex. Video coding compresses data effectively while reducing compression artifacts, which enhances the quality and functionality of AI video technologies. However, traditional models compress motion inaccurately and rely on ineffective motion compensation frameworks, which leads to compression faults and a poor rate‐distortion trade‐off. This work implements an automated and effective video compression task under VVC using a deep learning approach. Motion estimation is performed with a Motion Vector (MV) encoder‐decoder model that tracks movement in the video. Based on these MVs, the frame is reconstructed to compensate for the motion. The residual images are obtained using a Vision Transformer‐based Adaptive Recurrent MobileNet with Attention Network (ViT‐ARMAN). The parameters of the ViT‐ARMAN are optimized using the Opposition‐based Golden Tortoise Beetle Optimizer (OGTBO). Entropy coding is used in the training phase to estimate the bit rate of the residual images. Extensive experiments were conducted to demonstrate the effectiveness of the developed deep learning‐based method for video compression and bit rate reduction under VVC. |
---|---|
ISSN: | 0824-7935 1467-8640 |
DOI: | 10.1111/coin.70014 |
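
The summary describes a pipeline of motion estimation, motion compensation, residual computation, and entropy-based bit rate estimation. The following is a minimal sketch of that general idea only, not the article's method: it assumes a single global integer motion vector instead of a learned MV encoder-decoder, and it uses the empirical Shannon entropy of the residual as a stand-in for a learned entropy model. The function names `motion_compensated_residual` and `empirical_entropy_bits` are hypothetical and chosen for illustration.

```python
import numpy as np

def motion_compensated_residual(prev_frame, cur_frame, mv):
    """Predict cur_frame by shifting prev_frame with one global integer
    motion vector (dy, dx), then return the prediction residual.
    Assumption: wrap-around shifting via np.roll, for simplicity."""
    predicted = np.roll(prev_frame, shift=mv, axis=(0, 1))
    return cur_frame.astype(np.int16) - predicted.astype(np.int16)

def empirical_entropy_bits(residual):
    """Empirical Shannon entropy of the residual values, in bits per sample.
    This is a lower bound for an ideal symbol-wise entropy coder."""
    _, counts = np.unique(residual, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

# Synthetic example: the current frame is a pure translation of the previous one.
rng = np.random.default_rng(0)
prev_frame = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur_frame = np.roll(prev_frame, shift=(2, 3), axis=(0, 1))

# Without motion compensation the residual is noisy and costly to code.
bits_no_mc = empirical_entropy_bits(
    motion_compensated_residual(prev_frame, cur_frame, (0, 0)))
# With the correct motion vector the residual is all zeros.
bits_mc = empirical_entropy_bits(
    motion_compensated_residual(prev_frame, cur_frame, (2, 3)))

print(f"bits/sample without compensation: {bits_no_mc:.2f}")
print(f"bits/sample with compensation:    {bits_mc:.2f}")
```

In this toy setting, the correct motion vector reduces the residual to zeros, so its estimated bit rate drops to zero bits per sample. Real video has non-rigid motion and noise, which is why accurate motion estimation and a good residual model both affect the final bit rate.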