Efficient data partitioning for the GPU computation of moment functions

Bibliographic Details
Published in: Journal of Parallel and Distributed Computing, 2014-01, Vol. 74 (1), p. 1994-2004
Main Authors: Martín Requena, Manuel Jesús; Moscato, Pablo; Ujaldón, Manuel
Format: Article
Language: English
Description
Summary: In our previous work, we provided tools for the efficient characterization of biomedical images using Legendre and Zernike moments, showing their relevance as biomarkers for classifying image tiles from bone tissue regeneration studies (Ujaldón, 2009) [24]. As part of our quest for efficiency, we developed methods for accelerating those computations on GPUs (Martín-Requena and Ujaldón, 2011) [10,9]. This new stage of our work focuses on efficient data partitioning to optimize execution on many-core processors and clusters of GPUs, attaining gains of up to three orders of magnitude over multi-core CPUs of similar age and cost on 1 Mpixel images. We deploy a chain of successive optimizations that exploit symmetries in trigonometric functions and in the access patterns to image pixels. Combined with massive data parallelism on GPUs, these optimizations enable (1) real-time processing of our set of input biomedical images, and (2) the use of high-resolution images in clinical practice.
• We present a novel approach for computing Zernike moments on GPUs which leads to remarkable acceleration factors.
• We exploit symmetries within the computational space to obtain an additional 3.55x speedup.
• We extend our methodology to multi-GPU systems and GPU clusters.
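
The symmetry idea mentioned in the summary can be illustrated with a small CPU-side sketch. It is not the authors' GPU implementation: the function names, the pixel-to-disk mapping, and the choice of a four-fold reflection symmetry are all assumptions made for illustration, and the record does not say which symmetries the paper actually uses. The sketch relies on the standard definition of the Zernike radial polynomial and on the fact that the four mirror images of a pixel about both axes share the same radius, so the radial polynomial and the sine and cosine of the angle are evaluated once and reused with sign changes.

```python
import cmath
import math
from math import factorial


def radial(n, m, rho):
    """Standard Zernike radial polynomial R_nm(rho); requires n - |m| even."""
    m = abs(m)
    return sum(
        (-1) ** s * factorial(n - s)
        / (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s))
        * rho ** (n - 2 * s)
        for s in range((n - m) // 2 + 1)
    )


def _coord(index, size):
    # Hypothetical mapping: pixel centers spread symmetrically over [-1, 1].
    return (2 * index + 1 - size) / size


def zernike_direct(img, n, m):
    """Reference version: visits every pixel inside the unit disk."""
    size = len(img)
    total = 0j
    for row in range(size):
        y = _coord(row, size)
        for col in range(size):
            x = _coord(col, size)
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue
            theta = math.atan2(y, x)
            total += img[row][col] * radial(n, m, rho) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * total * (2 / size) ** 2


def zernike_symmetric(img, n, m):
    """Visits one quadrant and reuses R_nm, cos and sin for the three mirrors."""
    size = len(img)
    if size % 2 or (n - abs(m)) % 2 or abs(m) > n:
        raise ValueError("sketch assumes an even image size and a valid (n, m)")
    sign = -1 if m % 2 else 1  # (-1)^m, from the phase shift by pi
    total = 0j
    for row in range(size // 2, size):
        y = _coord(row, size)
        for col in range(size // 2, size):
            x = _coord(col, size)
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue
            theta = math.atan2(y, x)
            r = radial(n, m, rho)  # shared by all four mirrored pixels
            c, s = math.cos(m * theta), math.sin(m * theta)
            e, e_conj = complex(c, -s), complex(c, s)  # exp(-im*theta), exp(+im*theta)
            mrow, mcol = size - 1 - row, size - 1 - col  # mirrored indices
            total += r * (
                img[row][col] * e                # ( x,  y), angle theta
                + img[row][mcol] * sign * e_conj  # (-x,  y), angle pi - theta
                + img[mrow][mcol] * sign * e      # (-x, -y), angle pi + theta
                + img[mrow][col] * e_conj         # ( x, -y), angle -theta
            )
    return (n + 1) / math.pi * total * (2 / size) ** 2
```

Both functions should agree up to floating-point rounding on any square image with an even side length. The symmetric version evaluates the radial polynomial and the trigonometric terms for only a quarter of the pixels, which is the kind of saving the summary attributes to symmetry, although the speedup figures reported there come from the authors' GPU code and not from this sketch.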
ISSN: 0743-7315, 1096-0848
DOI: 10.1016/j.jpdc.2013.07.008