Fast Gaussian Filter Approximations Comparison on SIMD Computing Platforms

Bibliographic Details
Published in:	Applied sciences 2024-06, Vol.14 (11), p.4664
Main Authors:	Rybakova, Ekaterina O, Limonova, Elena E, Nikolaev, Dmitry P
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Approximation; approximations; computational efficiency; Decomposition; Efficiency; Electric filters; Gaussian smoothing; image filtering; Image processing; impulse response; Machine vision; Medical imaging equipment; Noise control; Number systems; quantization; Remote sensing

Description
Summary:	Gaussian filtering, being a convolution with a Gaussian kernel, is a widespread technique in image analysis and computer vision applications. It is the traditional approach for noise reduction. In some cases, performing the exact convolution can be computationally expensive and time-consuming. To address this problem, approximations of the convolution, such as running sums, Bell blur, and the Deriche approximation, are often used to balance accuracy against computational efficiency. At the same time, modern computing devices support data parallelism (vectorization) via Single Instruction Multiple Data (SIMD) and can process integer numbers faster than floating-point numbers. In this paper, we describe several methods for approximating a Gaussian filter, implement the SIMD and quantized versions, and compare them in terms of speed and accuracy. The experiments were performed on central processing units with an x86_64 architecture using a family of SSE SIMD extensions and an ARMv8 architecture using the NEON SIMD extension. All the optimized approximations demonstrated 10–20× speedup while maintaining the accuracy in the range of 1 × 10⁻⁵ or higher. The fastest method is a trivial Stack blur with a relatively high error, so we recommend using the second-order Vliet–Young–Verbeek filter and quantized Bell blur and running sums as more accurate and still computationally efficient alternatives.
ISSN:	2076-3417
DOI:	10.3390/app14114664
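
The summary names running sums as one of the approximations. As a rough illustration of the idea only, and not the authors' SIMD or quantized implementation, the sketch below approximates a 1D Gaussian by applying a box (mean) filter several times, with each pass computed by a running sum so its cost does not depend on the window width. All function names, the edge-replication border handling, the choice of three passes, and the 4-sigma truncation of the reference kernel are assumptions made for this example.

```python
import math


def box_blur_1d(signal, radius):
    """Mean filter of width 2*radius+1 computed with a running sum.

    Borders are handled by replicating the edge samples (an assumption).
    """
    n = len(signal)
    width = 2 * radius + 1
    padded = [signal[0]] * radius + list(signal) + [signal[-1]] * radius
    out = [0.0] * n
    window = sum(padded[:width])  # sum over the first window
    out[0] = window / width
    for i in range(1, n):
        # Slide the window: add the entering sample, drop the leaving one.
        window += padded[i + width - 1] - padded[i - 1]
        out[i] = window / width
    return out


def running_sums_gaussian_1d(signal, radius, passes=3):
    """Approximate Gaussian smoothing by repeated box blurs."""
    out = list(signal)
    for _ in range(passes):
        out = box_blur_1d(out, radius)
    return out


def equivalent_sigma(radius, passes=3):
    """Standard deviation of `passes` box filters of width 2*radius+1."""
    width = 2 * radius + 1
    return math.sqrt(passes * (width * width - 1) / 12.0)


def exact_gaussian_1d(signal, sigma):
    """Reference: direct convolution with a Gaussian truncated at 4 sigma."""
    half = int(math.ceil(4 * sigma))
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma))
              for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), n - 1)  # replicate edges
            acc += w * signal[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    impulse = [0.0] * 41
    impulse[20] = 1.0
    radius = 2
    approx = running_sums_gaussian_1d(impulse, radius)
    exact = exact_gaussian_1d(impulse, equivalent_sigma(radius))
    max_err = max(abs(a - e) for a, e in zip(approx, exact))
    print("max abs difference on impulse response:", max_err)
```

Comparing impulse responses, as the script does, is one simple way to gauge how far such an approximation is from the exact kernel; the error figures reported in the summary come from the paper's own experiments and are not reproduced by this sketch.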