An encoder‐decoder framework with dynamic convolution for weakly supervised instance segmentation

Bibliographic Details
Published in: IET Computer Vision, 2023-12, Vol. 17 (8), p. 883-894
Main Authors: Zhu, Liangjun; Peng, Li; Ding, Shuchen; Liu, Zhongren
Format: Article
Language: English
Description
Summary: Instance segmentation is widely employed in industrial robotics and autonomous vehicles. However, manually labelling object outlines is time‐consuming. To reduce annotation costs, we present a weakly supervised instance segmentation method in this article. A deep convolutional network is first used to construct multi‐scale feature maps for each object in the input image. An encoder‐decoder framework with dynamic convolution is then utilised to enhance model capacity and efficiency while avoiding the issues of anchor design, proposal selection, and RoIAlign implementation. In particular, Dynamic Heads are used in the encoder to create dynamic convolution kernels, while Instance Heads are used in the decoder to provide the global feature map. With dynamic convolution, each instance can be segmented independently, reducing interference from other instances and improving segmentation accuracy. Under the supervision of projection loss and pixel point colour pairing loss, the contours of each object are finally outlined. On the PASCAL VOC and MS COCO datasets, the proposed method is competitive with more sophisticated approaches. On the VOC dataset, it achieves 37.6% average precision with ResNet‐101 and FPN networks. Extensive visualised results demonstrate the effectiveness of the proposed encoder‐decoder framework with dynamic convolution. To efficiently and effectively maximise utilisation of global image information, we propose a novel method named encoder‐decoder framework with dynamic convolution (EDDC) for weakly supervised instance segmentation. It primarily consists of the backbone and neck subnetworks, as well as the Dynamic Head and Instance Head. With only box‐level supervision, EDDC significantly reduces annotation costs and produces high‐quality segmentation results.
ISSN: 1751-9632, 1751-9640
DOI: 10.1049/cvi2.12202
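
The summary mentions two ideas that a small example may make more concrete: per-instance dynamic convolution over a shared feature map, and a projection loss driven only by a bounding box. The record does not give the article's formulas, so the following is a minimal sketch under stated assumptions, not the authors' implementation: the dynamic kernel is taken to be a single 1x1 filter, the projection is a per-axis maximum, and the mismatch is measured with a Dice loss, as is common in box-supervised segmentation. All function names, shapes, and values are hypothetical, and the colour pairing loss is not modelled.

```python
import math


def dynamic_mask(features, kernel, bias):
    """Apply one instance's 1x1 dynamic kernel to a shared feature map.

    features: list of C channel maps, each an H x W nested list.
    kernel:   list of C weights assumed to be predicted for this instance.
    Returns an H x W map of mask probabilities in (0, 1).
    """
    height, width = len(features[0]), len(features[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            logit = bias + sum(w * ch[y][x] for w, ch in zip(kernel, features))
            row.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid
        mask.append(row)
    return mask


def projections(mask):
    """Max-project a mask onto the x axis (columns) and y axis (rows)."""
    proj_x = [max(col) for col in zip(*mask)]
    proj_y = [max(row) for row in mask]
    return proj_x, proj_y


def dice_loss(pred, target, eps=1e-6):
    """Dice loss between two 1-D sequences; 0 means perfect overlap."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def projection_loss(mask, box):
    """Compare mask projections with those of a box (x0, y0, x1, y1), inclusive."""
    x0, y0, x1, y1 = box
    height, width = len(mask), len(mask[0])
    box_x = [1.0 if x0 <= x <= x1 else 0.0 for x in range(width)]
    box_y = [1.0 if y0 <= y <= y1 else 0.0 for y in range(height)]
    proj_x, proj_y = projections(mask)
    return dice_loss(proj_x, box_x) + dice_loss(proj_y, box_y)


# Toy 2-channel, 4x4 feature map: channel 0 is active on the central 2x2 block.
features = [
    [[1.0 if 1 <= x <= 2 and 1 <= y <= 2 else 0.0 for x in range(4)]
     for y in range(4)],
    [[0.0] * 4 for _ in range(4)],
]
# Hypothetical kernel for one instance: strong response to channel 0.
mask = dynamic_mask(features, kernel=[20.0, 0.0], bias=-10.0)

loss_good = projection_loss(mask, box=(1, 1, 2, 2))  # box matches the mask
loss_bad = projection_loss(mask, box=(0, 0, 0, 0))   # box misses the mask
```

In this toy setting the loss is close to zero when the box matches the mask's extent and large when it does not, which is the signal that lets box-level labels stand in for pixel-level ones. A second instance would reuse the same `features` with a different kernel, which is how dynamic convolution keeps instances from interfering with one another.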