Loading…
Scale-aware spatial pyramid pooling with both encoder-mask and scale-attention for semantic segmentation
•A scale-aware SPP module is proposed for selecting scale feature for each pixel.•An encoder-mask module is proposed to extract sharper boundary information.•An attention module is proposed to capture long-range dependencies. This paper focuses on semantic segmentation of scenes by capturing the app...
Saved in:
Published in: | Neurocomputing (Amsterdam) 2020-03, Vol.383, p.174-182 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | •A scale-aware SPP module is proposed for selecting scale feature for each pixel.•An encoder-mask module is proposed to extract sharper boundary information.•An attention module is proposed to capture long-range dependencies.
This paper focuses on semantic segmentation of scenes by capturing the appropriate scale, rich detail and contextual dependencies in a feature representation. Semantic segmentation is a pixel-level classification task and has made steady progress on the basis of fully convolutional networks (FCNs). However, we find there is still room for improvement in the following aspects. The first is that the pixel itself has not enough information for semantic prediction, it needs to look around to determine which category it belongs to. However, the fixed size of the receptive field defined by the network is not suitable for all pixels when an image contains objects with various scales. The second is that the extracted scale-aware features do not handle sharper object boundaries due to low-resolution. The final aspect is regarding the ability to model long-range dependencies. In order to solve the above challenges, in this paper we propose three modules: Scale-aware spatial pyramid pool module, Encoder mask module and Scale-attention module (SSPP-ES). Extensive experiments on the Cityscapes and ADE20K benchmarks demonstrate the effectiveness of our approach for semantic segmentation. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2019.11.042 |