An Efficient Multi-Scale Feature Compression with QP-Adaptive Feature Channel Truncation for Video Coding for Machines
Published in: | IEEE Access 2023-01, Vol.11, p.1-1 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Machine vision-based intelligent applications that analyze video data collected by machines are rapidly increasing. Therefore, it is essential to efficiently compress a large volume of video data for machine consumption. Accordingly, the Moving Picture Experts Group (MPEG) has been developing a new video coding standard called Video Coding for Machines (VCM), aimed at video consumed by machines rather than humans. Recently, studies have demonstrated that multi-scale feature compression (MSFC)-based feature compression methods significantly improve the performance of MPEG-VCM. This paper proposes an efficient MSFC (eMSFC) method with quantization parameter (QP)-adaptive feature channel truncation. The proposed eMSFC incorporates an MSFC network with a selective learning strategy (SLS) and Versatile Video Coding (VVC)-based compression. The SLS extracts a single-scale feature from the input image, arranged in order of channel-wise importance. The size of the single-scale feature is adaptively adjusted by truncating the feature channels according to the QP. The truncated feature is efficiently compressed using VVC. Compared to the VCM feature anchor, the experimental results reveal that the proposed method provides a 98.72%, 98.34%, and 98.04% Bjontegaard delta rate gain for machine vision tasks of instance segmentation, object detection, and object tracking, respectively. The proposed method performed best among the "Call for Evidence" response technologies in MPEG-VCM. |
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2023.3307404 |
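
The summary states that the single-scale feature is ordered by channel-wise importance and that its size is adjusted by truncating channels according to the QP. The following is a minimal sketch of that idea only, not the authors' implementation. The record does not give the actual QP-to-channel mapping, so the linear schedule, the function names `channels_to_keep` and `truncate_channels`, and the parameters `qp_min`, `qp_max`, and `min_fraction` are all illustrative assumptions, as is the choice to keep fewer channels at higher QP.

```python
def channels_to_keep(qp, total_channels, qp_min=22, qp_max=47, min_fraction=0.25):
    """Hypothetical schedule (not from the paper): keep every channel at
    qp_min and shrink linearly to min_fraction of the channels at qp_max."""
    qp = max(qp_min, min(qp, qp_max))           # clamp QP to the assumed range
    t = (qp - qp_min) / (qp_max - qp_min)       # 0.0 at qp_min, 1.0 at qp_max
    fraction = 1.0 - t * (1.0 - min_fraction)   # linear interpolation
    return max(1, round(fraction * total_channels))


def truncate_channels(feature, qp):
    """Keep the leading channels of a feature whose channels are assumed to be
    sorted from most to least important, as the summary describes."""
    keep = channels_to_keep(qp, len(feature))
    return feature[:keep]


if __name__ == "__main__":
    # Toy feature: 64 channels, each a flat list of 4 values.
    feature = [[float(c)] * 4 for c in range(64)]
    for qp in (22, 32, 42, 47):
        print(qp, len(truncate_channels(feature, qp)))
```

Because the channels are assumed to be sorted by importance, truncation is a simple prefix slice: the least important channels are dropped first, and the remaining channels would then be passed to the VVC-based compression stage mentioned in the summary.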