Loading…
Optimizing a medical image registration algorithm based on profiling data for real-time performance
Image registration is a commonly task in medical image analysis. Therefore, a significant number of algorithms have been developed to perform rigid and non-rigid image registration. Particularly, the free-form deformation algorithm is frequently used to carry out non-rigid registration task; however...
Saved in:
Published in: | Multimedia tools and applications 2022-01, Vol.81 (2), p.2603-2620 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Image registration is a commonly task in medical image analysis. Therefore, a significant number of algorithms have been developed to perform rigid and non-rigid image registration. Particularly, the free-form deformation algorithm is frequently used to carry out non-rigid registration task; however, it is a computationally very intensive algorithm. In this work, we describe an approach based on profiling data to identify potential parts of this algorithm for which parallel implementations can be developed. The proposed approach assesses the efficient of the algorithm by applying performance analysis techniques commonly available in traditional computer operating systems. Hence, this article provides guidelines to support researchers working on medical image processing and analysis to achieve real-time non-rigid image registration applications using common computing systems. According to our experimental findings, significant speedups can be accomplished by parallelizing sequential snippets, i.e., code regions that are executed more than once. For the selected costly functions previously identified in the studied free-form deformation algorithm, the developed parallelization decreased the runtime by up to seven times relatively to the related single thread based implementation. The implementations were developed based on the Open Multi-Processing application programming interface. In conclusion, this study confirms that based on the call graph visualization and detected performance bottlenecks, one can easily find and evaluate snippets which are potential optimization targets in addition to throughput in memory accesses. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-021-11699-x |