
Development of an initial training and evaluation programme for manual lower limb muscle MRI segmentation

Bibliographic Details
Published in: European Radiology Experimental 2024-07, Vol.8 (1), p.85-12, Article 85
Main Authors: Morrow, Jasper M., Shah, Sachit, Cristiano, Lara, Evans, Matthew R. B., Doherty, Carolynne M., Alnaemi, Talal, Saab, Abeer, Emira, Ahmed, Klickovic, Uros, Hammam, Ahmed, Altuwaijri, Afnan, Wastling, Stephen, Reilly, Mary M., Hanna, Michael G., Yousry, Tarek A., Thornton, John S.
Format: Article
Language: English
Description
Summary:
Background: Magnetic resonance imaging (MRI) quantification of intramuscular fat accumulation is a responsive biomarker in neuromuscular diseases. Despite emergence of automated methods, manual muscle segmentation remains an essential foundation. We aimed to develop a training programme for new observers to demonstrate competence in lower limb muscle segmentation and establish reliability benchmarks for future human observers and machine learning segmentation packages.
Methods: The learning phase of the training programme comprised a training manual, direct instruction, and eight lower limb MRI scans with reference standard large and small regions of interest (ROIs). The assessment phase used test–retest scans from two patients and two healthy controls. Interscan and interobserver reliability metrics were calculated to identify underperforming outliers and to determine competency benchmarks.
Results: Three experienced observers undertook the assessment phase, whilst eight new observers completed the full training programme. Two of the new observers were identified as underperforming outliers, relating to variation in size or consistency of segmentations; six had interscan and interobserver reliability equivalent to those of experienced observers. The calculated benchmark for the Sørensen-Dice similarity coefficient between observers was greater than 0.87 and 0.92 for individual thigh and calf muscles, respectively. Interscan and interobserver reliability were significantly higher for large than small ROIs (all p
ISSN: 2509-9280
DOI: 10.1186/s41747-024-00475-9
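
Note on the Sørensen-Dice similarity coefficient: the summary reports interobserver benchmarks in terms of this metric, which for two segmentations A and B is 2|A ∩ B| / (|A| + |B|). The following is a minimal illustrative sketch in Python, not the authors' implementation. The function name `dice_coefficient`, the flat 0/1 mask representation, and the toy observer masks are assumptions made for the example, as is the convention of returning 1.0 when both masks are empty.

```python
def dice_coefficient(mask_a, mask_b):
    """Sørensen-Dice similarity between two binary masks.

    Masks are equal-length sequences of 0/1 voxel labels (an assumed,
    simplified representation of a segmented region of interest).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as muscle by both observers.
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total voxels labelled as muscle by each observer, summed.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * overlap / total


# Hypothetical toy masks for two observers segmenting the same slice.
observer_1 = [0, 1, 1, 1, 1, 0, 0, 0]
observer_2 = [0, 0, 1, 1, 1, 1, 0, 0]
score = dice_coefficient(observer_1, observer_2)
print(f"Dice = {score:.2f}")
```

A score of 1 means the two segmentations coincide exactly and 0 means they share no voxels, so the benchmarks quoted in the summary (greater than 0.87 for thigh and 0.92 for calf muscles) describe a high degree of overlap between observers.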