Adaptive Human-Centric Video Compression for Humans and Machines
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | We propose a novel framework to compress human-centric videos for both human viewing and machine analytics. Our system uses three coding branches to combine generic face-prior learning with data-dependent detail recovery. The generic branch embeds faces into a discrete code space described by a learned high-quality (HQ) codebook and reconstructs an HQ baseline face from it. The domain-adaptive branch adjusts the reconstruction to fit the current data domain by adding domain-specific information through a supplementary codebook. The task-adaptive branch derives assistive details from a low-quality (LQ) input to help machine analytics on the restored face. Adaptive weights balance the use of domain-adaptive and task-adaptive features in reconstruction, driving trade-offs among perceptual quality, fidelity, bitrate, and task accuracy. Moreover, the proposed online learning mechanism automatically adjusts the adaptive weights according to the actual compression needs. Because all domains and tasks share the main generic branch, our framework extends to multiple data domains and multiple tasks more flexibly than conventional coding schemes. Our experiments demonstrate that at very low bitrates we can restore faces with high perceptual quality for human viewing while maintaining high recognition accuracy for machine use. |
---|---|
ISSN: | 2160-7516 |
DOI: | 10.1109/CVPRW59228.2023.00119 |
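
The summary describes three branches whose outputs are balanced by adaptive weights, but it does not say how the features are combined. The toy sketch below is only an illustration under loud assumptions: it treats the generic branch as a nearest-codeword lookup in an HQ codebook, the domain-adaptive branch as a second lookup on the leftover residual, and the fusion as a plain weighted sum. The function names `nearest_code` and `fuse`, the weights `w_domain` and `w_task`, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def nearest_code(feature: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codeword closest to `feature` (squared L2 distance)."""
    def dist(code: Sequence[float]) -> float:
        return sum((f - c) ** 2 for f, c in zip(feature, code))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def fuse(generic: Sequence[float], domain: Sequence[float], task: Sequence[float],
         w_domain: float, w_task: float) -> Vector:
    """Assumed fusion rule: generic baseline plus weighted adaptive features."""
    return [g + w_domain * d + w_task * t for g, d, t in zip(generic, domain, task)]


# Toy codebooks (hypothetical values, two-dimensional for readability).
hq_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
domain_codebook = [[0.0, 0.0], [0.5, -0.5]]

feature = [1.4, 0.6]        # stand-in for an encoded face feature
task_detail = [0.2, 0.2]    # stand-in for details derived from the LQ input

# Generic branch: only the codeword index would need to be transmitted.
hq_index = nearest_code(feature, hq_codebook)
generic = hq_codebook[hq_index]

# Domain-adaptive branch: quantize what the generic codeword missed.
residual = [f - g for f, g in zip(feature, generic)]
domain_index = nearest_code(residual, domain_codebook)
domain = domain_codebook[domain_index]

# Weights of zero fall back to the generic baseline; larger weights add detail.
baseline = fuse(generic, domain, task_detail, w_domain=0.0, w_task=0.0)
adapted = fuse(generic, domain, task_detail, w_domain=0.5, w_task=0.0)
```

In this sketch, setting both weights to zero reproduces the generic baseline, which mirrors the trade-off the summary describes: the adaptive features are optional additions whose contribution is tuned per domain and task.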