Real-time semantic segmentation with local spatial pixel adjustment
Published in: | Image and vision computing 2022-07, Vol.123, p.104470, Article 104470 |
---|---|
Format: | Article |
Language: | English |
Summary: | Research on semantic segmentation networks has recently achieved significant breakthroughs. However, most methods have difficulty utilizing the information generated at each stage, which results in pixel value dislocation and blurred boundaries for small-scale objects. To overcome these challenges, this paper proposes a local spatial pixel adjustment network (LSPANet), which mainly consists of a dual-branch decoding fusion (DDF) module and a spatial pixel cross-correlation (SPCC) block. Specifically, the DDF module takes high-level and low-level feature maps from different stages as input and gradually eliminates the discrepancy between their information, fusing the variety of information extracted in the encoder stage. The SPCC block uses a horizontal spatial pixel adjustment (HSPA) module and a vertical spatial pixel adjustment (VSPA) module to capture the relationship among pixel values in the local horizontal and vertical space, respectively, and then assigns an importance to each value based on this relationship. LSPANet is evaluated on the Cityscapes and CamVid datasets. The experimental results show that the network achieves 77.1% mIoU with 2 M parameters on the challenging Cityscapes dataset, with an inference speed exceeding 30 FPS on a single GTX 2080 Ti GPU.
• Present a dual-branch decoding fusion module to fuse a variety of information.
• Propose a spatial pixel cross-correlation block to capture relationships in local space.
• Design a local spatial pixel adjustment network for real-time semantic segmentation. |
---|---|
ISSN: | 0262-8856 1872-8138 |
DOI: | 10.1016/j.imavis.2022.104470 |
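
The summary reports accuracy as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed from predicted and ground-truth label maps. It is not taken from the paper: the function name `mean_iou`, the flat-list input format, and the `ignore_index=255` convention for unlabeled pixels are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred, target: equal-length flat sequences of integer class labels.
    ignore_index: ground-truth label to skip (assumed convention, not from the paper).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from the score
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # a mismatch enlarges the union of both classes involved
            union[p] += 1
            union[t] += 1
    # average only over classes that actually appear
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Hypothetical six-pixel example with three classes; the last pixel is unlabeled.
score = mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 255], num_classes=3)
```

In this toy example the per-class IoUs are 1/2, 2/3 and 1, so `score` is their average, 13/18. Benchmark evaluations typically accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores.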