SED: Searching Enhanced Decoder with switchable skip connection for semantic segmentation
Published in: | Pattern recognition 2024-05, Vol.149, p.110196, Article 110196 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Neural architecture search (NAS) has shown excellent performance. However, existing semantic segmentation models rely heavily on pre-training on ImageNet or COCO and mainly focus on decoder design. Directly training encoder–decoder architecture search models from scratch to state-of-the-art (SOTA) accuracy for semantic segmentation can require thousands of GPU days, which greatly limits the application of NAS. To address this issue, we propose a novel neural architecture Search framework for Enhanced Decoder (SED). Using a pre-trained, hand-designed backbone and a search space composed of lightweight cells, SED searches for a decoder that can perform high-quality segmentation. Furthermore, we add switchable skip connection operations to the search space, expanding the diversity of possible network structures. The parameters of the backbone and of the operations selected in the search phase are copied to the retraining process. As a result, searching, pruning, and retraining can be done in just 1 day. The experimental results show that SED needs only 1/4 of the parameters and computation of a hand-designed decoder while obtaining higher segmentation accuracy on Cityscapes. Transferring the same decoder architecture to other datasets, such as Pascal VOC 2012, CamVid, and ADE20K, demonstrates the robustness of SED.
• For the task of image semantic segmentation, we propose SED, a gradient-based, pre-trainable neural architecture search framework. We simultaneously consider decoder search and skip connection search. Our method maximizes the advantages of NAS and a pre-trained backbone.
• SED can compress retraining to several thousand iterations. The whole searching, pruning, and retraining process can be compressed into 1 day. Furthermore, after searching on Cityscapes, the searched network architecture achieves 80.2% mIoU. |
---|---|
ISSN: | 0031-3203 1873-5142 |
DOI: | 10.1016/j.patcog.2023.110196 |