The Effect of Quality Control on Accuracy of Digital Pathology Image Analysis
Published in: | IEEE Journal of Biomedical and Health Informatics, 2021-02, Vol. 25 (2), p. 307-314 |
---|---|
Format: | Article |
Language: | English |
Summary: | Digital slide images produced from routine diagnostic histopathological preparations suffer from variation arising at every step of the processing pipeline. Pathologists typically compensate for such variation using expert knowledge and experience, which is difficult to replicate in automated solutions. This work explores the extent to which such inconsistencies affect image analysis by examining in detail the results of a previously published algorithm that automates the generation of the tumor:stroma ratio (TSR) in colorectal clinical trial datasets. One dataset, consisting of 2,211 cases and 106,268 expert-labelled images, is used to identify quality issues by visually inspecting the cases where algorithm-pathologist agreement is lowest. Twelve categories are identified and used to analyze pathologist-algorithm agreement. Of the 2,211 cases, 701 were found to be free from any image quality issues. Algorithm performance was then assessed by comparing pathologist agreement against image quality classification. Agreement was lowest on poorly differentiated tissue, with a mean TSR difference of 0.25 (sd = 0.24). Removing images that contained quality issues increased accuracy from 80% to 83%, at the expense of reducing the dataset to 33,736 images (32%). Training the algorithm on the optimized dataset before testing on all images decreased accuracy by 4%, indicating that the optimized dataset did not contain enough variation to generate a fully representative model. The results provide an in-depth perspective on image quality and highlight the importance of its effects on downstream image analysis. |
---|---|
ISSN: | 2168-2194 2168-2208 |
DOI: | 10.1109/JBHI.2020.3046094 |
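
The summary reports agreement as a mean TSR difference with a standard deviation per quality category, and states the share of images retained after filtering. A minimal Python sketch of that arithmetic follows. It is not the authors' code: the function names, the record layout, the use of absolute differences, the sample standard deviation, and the three example records are all assumptions made for illustration. Only the image counts 33,736 and 106,268 come from the summary.

```python
from collections import defaultdict
from statistics import mean, stdev


def agreement_by_category(records):
    """Group |pathologist TSR - algorithm TSR| by quality category.

    `records` is an iterable of (category, pathologist_tsr, algorithm_tsr)
    tuples. This layout is hypothetical, not taken from the paper.
    Returns {category: (mean difference, sample standard deviation)}.
    """
    diffs = defaultdict(list)
    for category, pathologist_tsr, algorithm_tsr in records:
        # Assumption: agreement is measured as an absolute difference.
        diffs[category].append(abs(pathologist_tsr - algorithm_tsr))
    return {
        category: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for category, values in diffs.items()
    }


def retained_fraction(kept_images, total_images):
    """Fraction of the dataset that survives quality filtering."""
    return kept_images / total_images


# Made-up example records, for illustration only.
records = [
    ("poorly differentiated", 0.50, 0.25),
    ("poorly differentiated", 0.75, 0.25),
    ("no quality issue", 0.50, 0.50),
]
summary = agreement_by_category(records)

# Image counts quoted in the summary: 33,736 kept out of 106,268.
share = retained_fraction(33_736, 106_268)
```

With the counts from the summary, `share` is about 0.317, which rounds to the quoted 32%.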