Fast Block Motion Estimation With 8-Bit Partial Sums Using SIMD Architectures
Published in: | IEEE transactions on circuits and systems for video technology 2007-08, Vol.17 (8), p.1041-1053 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | In order to take advantage of the byte-type data parallelism in the existing single-instruction multiple-data (SIMD) technique, this paper introduces the concept of 8-bit partial sums, obtained by a 4-bit right-shift operation on the sum of the 16 luminance values in a column of a 16 x 16 block of a video frame. Since these partial sums are of only eight bits, eight of them can be processed concurrently in a single 64-bit SIMD register. A method of employing these partial sums in order to speed up a given block motion-estimation algorithm is then proposed. The notion of the 8-bit partial sums is extended to the four-level case. It is shown that there are 15 possible methods of utilizing these multilevel 8-bit partial sums to accelerate a block motion-estimation algorithm without any loss of accuracy of the algorithm. Each of these 15 methods is used in the full-search algorithm to determine the one that provides the lowest computational complexity. This method is adopted as the chosen scheme to accelerate various block motion-estimation algorithms. Extensive simulations are carried out on eight video sequences showing that substantial speed-up can be achieved when the chosen scheme is incorporated with the various motion-estimation algorithms. The simulation results also demonstrate that the implementation on SIMD architectures can further accelerate the execution of the proposed scheme by more than 93%. |
ISSN: | 1051-8215 1558-2205 |
DOI: | 10.1109/TCSVT.2007.898645 |
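
The 8-bit partial sums described in the summary can be illustrated with a short sketch. It computes, for each column of a 16 x 16 block, the sum of the 16 luminance values shifted right by 4 bits, and packs eight such sums into one 64-bit word to mimic a 64-bit SIMD register. The function names, the byte layout in `pack64`, and the rejection bound in `sad_lower_bound` are assumptions made for illustration; the record does not state how the paper applies the partial sums, so the bound below is a standard triangle-inequality argument, not necessarily the authors' scheme.

```python
BLOCK = 16  # block width and height, as stated in the summary


def partial_sums(block):
    """Return the 16 column partial sums of a 16x16 block of 8-bit luminance values."""
    assert len(block) == BLOCK and all(len(row) == BLOCK for row in block)
    sums = []
    for x in range(BLOCK):
        col_sum = sum(block[y][x] for y in range(BLOCK))  # at most 16 * 255 = 4080
        sums.append(col_sum >> 4)  # at most 255, so it fits in 8 bits
    return sums


def pack64(eight_sums):
    """Pack eight 8-bit values into one 64-bit integer.

    Index 0 goes into the lowest byte; this layout is an arbitrary choice.
    """
    assert len(eight_sums) == 8
    word = 0
    for i, s in enumerate(eight_sums):
        assert 0 <= s <= 0xFF
        word |= s << (8 * i)
    return word


def sad(a, b):
    """Sum of absolute differences between two 16x16 blocks."""
    return sum(abs(a[y][x] - b[y][x]) for y in range(BLOCK) for x in range(BLOCK))


def sad_lower_bound(ps_a, ps_b):
    """Hypothetical lower bound on the SAD derived from the partial sums.

    If two partial sums differ by d >= 1, the unshifted column sums differ
    by at least 16 * d - 15, and the SAD is at least the total of the
    absolute column-sum differences (triangle inequality).
    """
    total = 0
    for pa, pb in zip(ps_a, ps_b):
        d = abs(pa - pb)
        if d:
            total += 16 * d - 15
    return total


white = [[255] * BLOCK for _ in range(BLOCK)]
black = [[0] * BLOCK for _ in range(BLOCK)]
ramp = [[(16 * x + y) % 256 for x in range(BLOCK)] for y in range(BLOCK)]

ps_white = partial_sums(white)
ps_black = partial_sums(black)
ps_ramp = partial_sums(ramp)
low_word = pack64(ps_white[:8])
```

A search loop could skip the full SAD for a candidate block whenever `sad_lower_bound` already exceeds the best SAD found so far, which is one way a bound of this kind can save work without changing the search result.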