
PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation


Bibliographic Details
Published in: Neurocomputing (Amsterdam) 2024-09, Vol.599, p.128083, Article 128083
Main Authors: Yan, Cilin, Wang, Haochen, Liu, Jie, Jiang, Xiaolong, Hu, Yao, Tang, Xu, Kang, Guoliang, Gavves, Efstratios
Format: Article
Language:English
Description
Summary: Click-based interactive segmentation aims to generate target masks via human clicking, which facilitates efficient pixel-level annotation and image editing. In such a task, target ambiguity remains a problem hindering the accuracy and efficiency of segmentation. That is, in scenes with rich context, one click may correspond to multiple potential targets, while most previous interactive segmentors generate only a single mask and fail to deal with target ambiguity. In this paper, we propose a novel interactive segmentation network named PiClick, which yields all potentially reasonable masks and suggests the most plausible one to the user. Specifically, PiClick utilizes a Transformer-based architecture to generate all potential target masks through mutually interactive mask queries. Moreover, a Target Reasoning module is designed in PiClick to automatically suggest the user-desired mask from all candidates, relieving target ambiguity and extra human effort. Extensive experiments on 9 interactive segmentation datasets demonstrate that PiClick performs favorably against previous state-of-the-art methods in terms of segmentation results. Moreover, we show that PiClick effectively reduces the human effort required to annotate and pick the desired masks. To ease usage and inspire future research, we release the source code of PiClick together with a plug-and-play annotation tool at https://github.com/cilinyan/PiClick.
• We propose PiClick to generate multiple masks to mitigate the target ambiguity issue.
• A Target Reasoning module (TRM) is designed to mimic human mask choice behavior.
• PiClick outperforms prior methods on 9 interactive segmentation tasks.
ISSN: 0925-2312, 1872-8286
DOI:10.1016/j.neucom.2024.128083