Self-Balanced R-CNN for instance segmentation

Bibliographic Details
Published in: Journal of Visual Communication and Image Representation, 2022-08, Vol. 87, Article 103595
Main Authors: Rossi, Leonardo; Karimi, Akbar; Prati, Andrea
Format: Article
Language: English
Description
Summary: Current state-of-the-art two-stage models for the instance segmentation task suffer from several types of imbalance. In this paper, we address the Intersection over Union (IoU) distribution imbalance of the positive input Regions of Interest (RoIs) during the training of the second stage. Our Self-Balanced R-CNN (SBR-CNN), an evolved version of the Hybrid Task Cascade (HTC) model, introduces new loop mechanisms for bounding box and mask refinement. With an improved Generic RoI Extraction (GRoIE), we also address the feature-level imbalance at the Feature Pyramid Network (FPN) level, which is caused by a non-uniform integration of low- and high-level features from the backbone layers. In addition, redesigning the architecture heads toward a fully convolutional approach with FCC further reduces the number of parameters and gives more insight into the connection between the task to solve and the layers used. Moreover, our SBR-CNN model shows the same or even better improvements when adopted in conjunction with other state-of-the-art models. In fact, with a lightweight ResNet-50 backbone, evaluated on the COCO minival 2017 dataset, our model reaches 45.3% and 41.5% AP for object detection and instance segmentation, respectively, with 12 epochs and without extra tricks. The code is available at https://github.com/IMPLabUniPr/mmdetection/tree/sbr_cnn.
• A deep analysis of the IoU Distribution Imbalance (IDI) problem in the RPN proposals.
• A brand new loop architecture for detection and segmentation heads.
• Redesign of the model heads (FCC) toward a fully convolutional approach.
• A better performing GRoIE model for extraction of RoIs in a two-stage architecture.
• The proposal of SBR-CNN, which includes R³-CNN, FCC and GRoIE.
ISSN: 1047-3203; 1095-9076
DOI: 10.1016/j.jvcir.2022.103595
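
Illustration: the summary refers to the IoU distribution of positive RoIs. The sketch below is not the authors' implementation. It only shows, under stated assumptions, how that distribution could be inspected: boxes are (x1, y1, x2, y2) tuples, a proposal counts as positive when its best IoU with any ground-truth box is at least 0.5 (an assumed threshold), and the names `iou` and `positive_iou_histogram` are hypothetical helpers introduced here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def positive_iou_histogram(
    proposals: List[Box],
    gt_boxes: List[Box],
    pos_thr: float = 0.5,  # assumed positive threshold, not taken from the record
    bins: int = 5,
) -> List[int]:
    """Count positive proposals per IoU bin over the range [pos_thr, 1.0]."""
    counts = [0] * bins
    width = (1.0 - pos_thr) / bins
    for p in proposals:
        # Best overlap of this proposal with any ground-truth box.
        best = max((iou(p, g) for g in gt_boxes), default=0.0)
        if best >= pos_thr:
            idx = min(int((best - pos_thr) / width), bins - 1)
            counts[idx] += 1
    return counts


if __name__ == "__main__":
    gt = [(0.0, 0.0, 10.0, 10.0)]
    props = [
        (0.0, 0.0, 10.0, 10.0),  # perfect overlap
        (0.0, 0.0, 10.0, 7.5),   # partial overlap
        (0.0, 0.0, 10.0, 5.5),   # barely positive
        (0.0, 0.0, 10.0, 4.0),   # below threshold, ignored
    ]
    print(positive_iou_histogram(props, gt))
```

A histogram whose counts pile up in the lowest bins would correspond to the kind of imbalance the summary describes, where few high-IoU positives reach the second stage during training.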