Loading…

SURGIVID: Annotation-Efficient Surgical Video Object Discovery

Surgical scenes convey crucial information about the quality of surgery. Pixel-wise localization of tools and anatomical structures is the first task towards deeper surgical analysis for microscopic or endoscopic surgical views. This is typically done via fully-supervised methods which are annotatio...

Full description

Saved in:

Bibliographic Details
Published in:	arXiv.org 2024-09
Main Authors:	Köksal, Çağhan, Ghazal Ghazaei, Navab, Nassir
Format:	Article
Language:	English
Subjects:	Ablation Annotations Image annotation Image segmentation Labels Localization Semantic segmentation Surgical instruments Video
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Surgical scenes convey crucial information about the quality of surgery. Pixel-wise localization of tools and anatomical structures is the first task towards deeper surgical analysis for microscopic or endoscopic surgical views. This is typically done via fully-supervised methods which are annotation greedy and in several cases, demanding medical expertise. Considering the profusion of surgical videos obtained through standardized surgical workflows, we propose an annotation-efficient framework for the semantic segmentation of surgical scenes. We employ image-based self-supervised object discovery to identify the most salient tools and anatomical structures in surgical videos. These proposals are further refined within a minimally supervised fine-tuning step. Our unsupervised setup reinforced with only 36 annotation labels indicates comparable localization performance with fully-supervised segmentation models. Further, leveraging surgical phase labels as weak labels can better guide model attention towards surgical tools, leading to \(\sim 2\%\) improvement in tool localization. Extensive ablation studies on the CaDIS dataset validate the effectiveness of our proposed solution in discovering relevant surgical objects with minimal or no supervision.
ISSN:	2331-8422