Loading…
Algorithm optimization and hardware implementation for Merge mode in HEVC
Merge mode is a new tool for improving inter-frame coding efficiency in high-efficiency video coding. This tool can save the bitrate for the motion vector by sharing this vector with neighboring blocks. Merge is a process that selects a candidate motion vector by calculating the cost of rate-distort...
Saved in:
Published in: | Journal of real-time image processing 2020-06, Vol.17 (3), p.623-630 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Merge mode is a new tool for improving inter-frame coding efficiency in high-efficiency video coding. This tool can save the bitrate for the motion vector by sharing this vector with neighboring blocks. Merge is a process that selects a candidate motion vector by calculating the cost of rate-distortion. However, this process requires a large number of complex computations and memory access, thereby resulting in the low efficiency of hardware implementation. This paper proposes a new Merge candidate decision scheme that determines the most favorable Merge candidate from a full list of candidates by comparing the sum of absolute transformed difference with the weighted header bit instead of performing a complex calculation for sum of squared difference and entropy coding process in HM16.7. The simulation results show that the performance of the proposed algorithm is close to that of HM16.7 and increases the BD-rate only by 0.22–1.21%. The multilevel pipelines architecture is also exploited in the hardware design. The weighted header bit operation is performed by using the look-up table, which reduces both the complexity and encoding clock cycle. The designed system is implemented with a register transfer level code. The synthesis results from the Design Compiler show that compared with other architecture, the proposed architecture offers great advantages in resource utilization and can process 1920 × 1080 at 353 frame/s for P-slices with a clock frequency of 1057 MHz and logic gate count of 285.2 K. |
---|---|
ISSN: | 1861-8200 1861-8219 |
DOI: | 10.1007/s11554-018-0818-4 |