
Parallel content-based sub-image retrieval using hierarchical searching

Bibliographic Details
Published in: Bioinformatics 2014-04, Vol.30 (7), p.996-1002
Main Authors: Yang, Lin; Qi, Xin; Xing, Fuyong; Kurc, Tahsin; Saltz, Joel; Foran, David J
Format: Article
Language: English
Description
Summary: The capacity to systematically search through large image collections and ensembles and detect regions exhibiting similar morphological characteristics is central to pathology diagnosis. Unfortunately, the primary methods used to search digitized, whole-slide histopathology specimens are slow and prone to inter- and intra-observer variability. The central objective of this research was to design, develop, and evaluate a content-based image retrieval system that helps doctors perform quick and reliable comparative searches for similar prostate image patches. Given a representative image patch (sub-image), the algorithm returns a ranked ensemble of image patches from across the entire whole-slide histology section that exhibit the most similar morphologic characteristics. This is accomplished by first performing hierarchical searching based on a newly developed hierarchical annular histogram (HAH). The set of candidates is then refined in a second stage of processing by computing a color histogram from eight equally divided segments within each square annular bin defined in the original HAH. A demand-driven master-worker parallelization approach is employed to speed up the search. Under this strategy, the query patch is broadcast to all worker processes, and the master process dynamically assigns each worker an image in which to search for similar patches and return them as a ranked list. The algorithm was tested using digitized hematoxylin and eosin (H&E) stained prostate cancer specimens and achieved a recall rate of ∼90% within the top 40 ranked retrieved image patches. Both the testing data and source code can be downloaded from http://pleiad.umdnj.edu/CBII/Bioinformatics/.
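The first-stage idea in the summary, histograms over concentric square rings used to prune candidates coarse-to-fine, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (ring_index, annular_histograms, hierarchical_search), the keep_fraction parameter, the use of grayscale intensities instead of color, the L1 distance, and the inner-ring-first pruning order are all assumptions made for illustration; the second-stage refinement with eight segments per ring is not shown.

```python
import math
from typing import List, Sequence

# Assumption: a patch is a square grid of grayscale intensities in 0..255.
Patch = Sequence[Sequence[int]]


def ring_index(row: int, col: int, size: int, num_rings: int) -> int:
    """Map a pixel to its concentric square ring (0 = innermost)."""
    center = (size - 1) / 2.0
    # Chebyshev distance from the center yields square, not circular, rings.
    dist = max(abs(row - center), abs(col - center))
    return min(int(dist / (size / 2.0) * num_rings), num_rings - 1)


def annular_histograms(patch: Patch, num_rings: int = 2,
                       num_bins: int = 4) -> List[List[float]]:
    """Return one normalized intensity histogram per square ring."""
    size = len(patch)
    hists = [[0.0] * num_bins for _ in range(num_rings)]
    for r in range(size):
        for c in range(size):
            ring = ring_index(r, c, size, num_rings)
            hists[ring][patch[r][c] * num_bins // 256] += 1.0
    for hist in hists:
        total = sum(hist)
        if total > 0:
            for b in range(num_bins):
                hist[b] /= total
    return hists


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute bin differences (an assumed dissimilarity measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hierarchical_search(query: Patch, candidates: Sequence[Patch],
                        num_rings: int = 2,
                        keep_fraction: float = 0.5) -> List[int]:
    """Rank candidate indices, pruning after each ring is added.

    Assumption: rings are compared from the innermost outward, and only the
    best keep_fraction of the surviving candidates advances to the next ring.
    """
    q = annular_histograms(query, num_rings)
    cand = [annular_histograms(c, num_rings) for c in candidates]
    alive = list(range(len(candidates)))
    for ring in range(num_rings):
        def cost(i: int, upto: int = ring) -> float:
            # Cumulative distance over rings 0..upto.
            return sum(l1_distance(q[k], cand[i][k]) for k in range(upto + 1))
        alive.sort(key=cost)
        alive = alive[:max(1, math.ceil(len(alive) * keep_fraction))]
    return alive
```

Calling hierarchical_search(query, candidates) returns the indices of the surviving candidates, best match first. Cheap inner-ring comparisons discard most candidates before the outer rings are evaluated, which is the motivation for a hierarchical search. In a demand-driven master-worker setting, each worker would run such a search on the image assigned to it.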
ISSN: 1367-4803
1367-4811
1460-2059
DOI: 10.1093/bioinformatics/btt623