LRDNet: A lightweight and efficient network with refined dual attention decorder for real-time semantic segmentation
Published in: | Neurocomputing (Amsterdam) 2021-10, Vol.459, p.349-360 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • A lightweight and efficient neural network for real-time semantic segmentation. • An efficient split convolution increases inference speed and improves the accuracy of feature extraction. • A refined dual attention mechanism reduces model complexity and improves accuracy. • LRDNet demonstrates a good trade-off among parameter size, computational cost, and accuracy on the Cityscapes dataset. • With fewer than 0.66 M parameters, the model runs at up to 77 FPS on a GTX 1080 Ti.
Most popular semantic segmentation convolutional networks focus on accuracy and rely on complex models that require a large amount of computation. To achieve real-time performance in practical applications such as embedded systems and mobile devices, lightweight semantic segmentation has become a necessity: the network must maintain good accuracy within a very limited computing budget. In this paper, we propose a lightweight network with a refined dual attention decoder (termed LRDNet) that better balances computational speed and segmentation accuracy. In the encoder of LRDNet, we introduce an asymmetric module based on the residual network to keep the model lightweight and efficient. This module combines decomposition convolution and deep convolution to improve the efficiency of feature extraction. In the decoder of LRDNet, we use a refined dual attention mechanism to reduce the complexity of the entire network. Our network attains precise real-time segmentation results on the Cityscapes and CamVid datasets. Without additional processing or pretraining, LRDNet achieves 70.1% mean IoU on the Cityscapes test set. With fewer than 0.66 M parameters, it runs at up to 77 FPS. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2021.07.019 |
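
The summary reports accuracy as mean IoU (intersection over union, averaged over classes). As a rough illustration of how that metric is commonly computed, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function name `mean_iou`, the toy label lists, and the use of 255 as an ignore label are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are flat sequences of integer class labels of equal length.
    ignore_index is an assumed "void" label that is skipped in the ground truth.
    """
    # conf[t][p] counts pixels whose true class is t and predicted class is p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                                   # true c, predicted as another class
        fp = sum(conf[r][c] for r in range(num_classes)) - tp    # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:                                            # skip classes absent from both maps
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical data): flattened label maps with 3 classes.
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU = {score:.4f}")
```

For a benchmark-style score, the confusion counts would typically be accumulated over every image in the evaluation set before taking the per-class ratios, rather than averaging per-image scores.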