Evaluation of generalization ability for deep learning‐based auto‐segmentation accuracy in limited field of view CBCT of male pelvic region

Bibliographic Details
Published in: Journal of applied clinical medical physics 2023-05, Vol.24 (5), p.e13912-n/a
Main Authors: Hirashima, Hideaki; Nakamura, Mitsuhiro; Imanishi, Keiho; Nakao, Megumi; Mizowaki, Takashi
Format: Article
Language: English
Description
Summary:
Purpose The aim of this study was to evaluate how well the segmentation accuracy of a full‐image CNN generalizes to limited FOV CBCT of the male pelvic region. Auto‐segmentation accuracy was evaluated using datasets with different intensity distributions and FOV sizes.
Methods A total of 171 CBCT datasets from patients with prostate cancer were enrolled: 151, 10, and 10 datasets acquired with Vero4DRT, TrueBeam STx, and Clinac‐iX, respectively. The FOV for Vero4DRT, TrueBeam STx, and Clinac‐iX was 20, 26, and 25 cm, respectively. The ROIs, including the bladder, prostate, rectum, and seminal vesicles, were manually delineated. The U2‐Net architecture was used to train the segmentation model. A total of 131 limited FOV CBCT datasets from Vero4DRT were used for training (104 datasets) and validation (27 datasets); the remaining datasets were used for testing. The training routine saved the weights that maximized the DSC on the validation set. Segmentation accuracy was evaluated qualitatively and quantitatively by comparing the ground truth and predicted ROIs in the different testing datasets.
Results The mean ± standard deviation visual evaluation scores for the bladder, prostate, rectum, and seminal vesicles across all treatment machines were 1.0 ± 0.7, 1.5 ± 0.6, 1.4 ± 0.6, and 2.1 ± 0.8 points, respectively. The median DSC values for all imaging devices were ≥0.94 for the bladder, 0.84–0.87 for the prostate and rectum, and 0.48–0.69 for the seminal vesicles. Although the DSC values for the bladder and seminal vesicles differed significantly among the three imaging devices, the DSC value of the bladder changed by less than 1 percentage point. The median MSD values for all imaging devices were ≤1.2 mm for the bladder and 1.4–2.2 mm for the prostate, rectum, and seminal vesicles. The MSD values for the seminal vesicles differed significantly among the three imaging devices.
Conclusion The proposed method is effective for testing datasets whose intensity distributions and FOV differ from those of the training datasets.
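The summary relies on the DSC (Dice similarity coefficient) both as an evaluation metric and as the criterion for keeping the best weights during training. As a rough illustration only, and not the authors' code, the sketch below computes the DSC for two flattened binary masks and picks the epoch with the highest validation DSC. The function names `dice_coefficient` and `best_epoch` are hypothetical, and treating two empty masks as a perfect match is an assumed convention that the record does not specify.

```python
from typing import Iterable, Sequence


def dice_coefficient(truth: Iterable[int], pred: Iterable[int]) -> float:
    """Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), for flattened binary masks."""
    truth_mask = [bool(v) for v in truth]
    pred_mask = [bool(v) for v in pred]
    if len(truth_mask) != len(pred_mask):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as foreground in both masks.
    overlap = sum(t and p for t, p in zip(truth_mask, pred_mask))
    total = sum(truth_mask) + sum(pred_mask)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


def best_epoch(validation_dsc: Sequence[float]) -> int:
    """Index of the epoch with the highest validation DSC (first one on ties)."""
    if not validation_dsc:
        raise ValueError("need at least one epoch")
    return max(range(len(validation_dsc)), key=validation_dsc.__getitem__)
```

In a real training loop the weights would be written to disk whenever the current epoch's validation DSC exceeds the best value seen so far; `best_epoch` expresses the same selection after the fact on a list of per-epoch scores.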
ISSN: 1526-9914
DOI: 10.1002/acm2.13912